



PROCEEDINGS 

OF THE

ROYAL SOCIETY OF EDINBURGH 




PROCEEDINGS 


OF


THE ROYAL SOCIETY 
OF EDINBURGH 


Section A (Mathematical and Physical Sciences) 


VOL. LXIII


1949-1952


PUBLISHED BY 

OLIVER & BOYD 
EDINBURGH: TWEEDDALE COURT 
LONDON: 98 GREAT RUSSELL STREET, W.C. 1 




CONTENTS 


NO.                                                                        PAGE

 *1. The Adventures of an Hypothesis. By James Kendall, M.A., D.Sc., F.R.S.,
     P.R.S.E. (With Two Text-figures.) Issued separately May 17, 1950 . . . 1

 *2. The Stability of Solutions of Non-linear Difference-differential Equations.
     By Professor E. M. Wright, University of Aberdeen. Issued separately
     May 17, 1950 . . . 18

  3. The Simplest Form of Second-Order Linear Differential Equation, with
     Periodic Coefficient, having Finite Singularities. By Enzo Cambi.
     Communicated by Professor A. C. Aitken, F.R.S. (With Two Text-figures.)
     Issued separately June 16, 1950 . . . 27

 *4. Studies in Practical Mathematics. V. On the Iterative Solution of a System
     of Linear Equations. By A. C. Aitken, D.Sc., F.R.S., Mathematical
     Institute, 16 Chambers Street, Edinburgh, 1. Issued separately June 16,
     1950 . . . 5 *

  5. Les transformations asymptotiquement presque périodiques discontinues et le
     lemme ergodique. (Première Note.) Par Maurice Fréchet, Hon.F.R.S.E.,
     Université de Paris, à la Sorbonne. Communicated by Sir Edmund
     Whittaker, F.R.S. Issued separately June 16, 1950 . . . 61

  6. Unbiased Statistics with Minimum Variance. By A. Bhattacharyya,
     Statistical Laboratory, Calcutta. Communicated by Professor A. C.
     Aitken, F.R.S. Issued separately June 16, 1950 . . . 69

  7. Parallel Planes in a Riemannian V_n. By H. S. Ruse, The University, Leeds.
     Issued separately June 16, 1950 . . . 78

 *8. A Further Note on a Problem in Factor Analysis. By D. N. Lawley, M.A.,
     D.Sc., University of Edinburgh. Issued separately June 16, 1950 . . . 93

 *9. A Measurement of the Velocity of Light. By R. A. Houstoun, M.A., D.Sc.,
     F.Inst.P., Natural Philosophy Department, University of Glasgow. (With
     Four Text-figures.) Issued separately June 16, 1950 . . . 95

*10. The Reciprocity Theory of Electrodynamics. By H. S. Green and K. C.
     Cheng, University of Edinburgh. Communicated by Professor Max Born,
     F.R.S. Issued separately May 14, 1951 . . . 105

*11. Application of Relaxation Methods to Compressible Flow past a Double
     Wedge. By A. R. Mitchell, Ph.D., and D. E. Rutherford, Dr.Math., D.Sc.,
     United College, University of St Andrews. (With Seven Text-figures.)
     Issued separately May 14, 1951 . . . 139



*12. Clebsch-Aronhold Symbols and the Theory of Symmetric Functions. By
     H. W. Turnbull and A. H. Wallace, The University, St Andrews. Issued
     separately May 14, 1951 . . . 155

*13. Studies in Practical Mathematics. VI. On the Factorization of Polynomials
     by Iterative Methods. By A. C. Aitken, D.Sc., F.R.S., Mathematical
     Institute, University of Edinburgh. Issued separately May 14, 1951 . . . 174

*14. Experiments in Diffraction Microscopy. By G. L. Rogers, M.A., Ph.D.,
     Department of Physics, University College, Dundee, Angus. Communicated
     by Professor G. D. Preston. (With Two Plates and Nine Text-figures.)
     Issued separately February 9, 1952 . . . 193

*15. Theorems on the Convergence and Asymptotic Validity of Abel's Series.
     By A. J. Macintyre and Sheila Scott Macintyre, University of Aberdeen.
     Issued separately February 9, 1952 . . . 232

*16. Some Continuant Determinants arising in Physics and Chemistry—II. By
     D. E. Rutherford, D.Sc., Dr.Math., United College, University of St
     Andrews. Issued separately February 9, 1952 . . . 232

*17. The Sargent Diagram for the Electron-capture Process, and the Disintegration
     Energies of Heavy β-emitters. By N. Feather, Department of Natural
     Philosophy, University of Edinburgh. Issued separately February 9,
     1952 . . . 242

*18. The Elementary Proof of the Prime Number Theorem. By E. M. Wright,
     University of Aberdeen. Issued separately May 30, 1952 . . . 257

 19. A Generalization of the Classical Random-walk Problem, and a Simple Model
     of Brownian Motion based Thereon. By G. Klein, Ph.D., Birkbeck
     College, London, now at Université Libre de Bruxelles. Communicated by
     Dr R. Fürth. Issued separately May 30, 1952 . . . 268

 20. On the Estimation of Variance and Covariance. By E. H. Lloyd, Imperial
     College, London. Communicated by Professor H. Levy. Issued separately
     May 30, 1952 . . . 280

 21. The Statistical Theory of Stiff Chains. By H. E. Daniels, M.A.(Cantab.),
     Ph.D.(Edin.), Statistical Laboratory, University of Cambridge.
     Communicated by Professor A. C. Aitken, F.R.S. Issued separately May 30,
     1952 . . . 290

*22. Artificial Holograms and Astigmatism. By G. L. Rogers, M.A., Ph.D.,
     Department of Physics, University College, Dundee. Communicated by
     Professor G. D. Preston. (With One Plate and Four Text-figures.) Issued
     separately August 23, 1952 . . . 313

*23. Studies in Practical Mathematics. VII. On the Theory of Methods of
     Factorizing Polynomials by Iterated Division. By A. C. Aitken, D.Sc.,
     F.R.S., Mathematical Institute, University of Edinburgh. Issued separately
     August 23, 1952 . . . 326









*24. The Solution of a Functional Equation. By A. H. Read, M.A., United
     College, University of St Andrews. Communicated by Dr D. E. Rutherford.
     Issued separately August 23, 1952 . . . 336

*25. The First Chemical Society, the First Chemical Journal, and the Chemical
     Revolution. By James Kendall, M.A., D.Sc., LL.D., F.R.S., P.R.S.E.
     (With Two Plates.) Issued separately October 20, 1952 . . . 346

*26. The Normal Penetration of a Thin Elastic-Plastic Plate by a Right Circular
     Cone. By J. W. Craggs, Ph.D., University College, Dundee. Communicated
     by J. M. Jackson, Ph.D. (With Four Text-figures.) Issued
     separately October 20, 1952 . . . 359

*27. The Rotational Field behind a Bow Shock Wave in Axially Symmetric Flow
     using Relaxation Methods. By A. R. Mitchell, Ph.D., and Francis McCall,
     B.Sc., United College, University of St Andrews. Communicated by
     Dr D. E. Rutherford. (With Five Text-figures.) Issued separately
     October 20, 1952 . . . 371

*28. A Molecular Sum Rule. By D. ter Haar, Department of Natural Philosophy,
     University of St Andrews. Issued separately December 4, 1952 . . . 381

*29. The First Chemical Society, the First Chemical Journal, and the Chemical
     Revolution (Part II). By James Kendall, M.A., D.Sc., LL.D., F.R.S.,
     P.R.S.E. Issued separately December 4, 1952 . . . 385

* The thanks of the Society are due to the Carnegie Trust for the Universities of
Scotland for grants towards the costs of the illustrations, tables, etc. in these papers.





PROCEEDINGS 


OF THE

ROYAL SOCIETY OF EDINBURGH 


Section A (Mathematical and Physical Sciences) 

VOLUME LXIII 

I. —The Adventures of an Hypothesis.* By James Kendall, 

M.A., D.Sc., F.R.S., P.R.S.E. (With Two Text-figures) 

(Address of the President at a Meeting held on December 5, 1949) 

(MS. received October 27, 1949) 

1. Origin and Rise

In the year 1815 an anonymous article appeared in Thomas Thomson’s 
Annals of Philosophy (1) entitled “On the Relation between the Specific 
Gravities of Bodies in their Gaseous State and the Weights of their Atoms”.
Its introductory paragraph illustrates the hesitancy of the writer in its 
exposition: “The author of the following essay submits it to the public 
with the greatest diffidence; for although he has taken the utmost pains to 
arrive at the truth, yet he has not that confidence in his abilities as an 
experimentalist as to induce him to dictate to others far superior to himself 
in chemical acquirements and fame. He trusts, however, that its importance
will be seen, and that some one will undertake to examine it, and thus
verify or refute its conclusions. If these should be proved erroneous, 
still new facts may be brought to light, or old ones better established, by 
the investigation; but if they should be verified, a new and interesting 
light will be thrown upon the whole science of chemistry.” 

The general conclusion reached from the experimental work that 
constitutes the main body of the paper—the results of a great many other 
investigators are also collected in four tables—is rather indefinitely and 
tentatively formulated, but it is evident what the author wishes to suggest
he has proved, namely, that the atomic weights of all other elements are
exact multiples of the atomic weight of the lightest element, hydrogen.
Examining his data critically, one has grave doubts whether this conclusion
was really justified. True, the work is of higher accuracy than that of
Dalton, whose values for carbon and oxygen differed by more than ten
per cent. from the correct figures when he stated in 1810 (2): “An atom
of carbonic oxide consists then of one of carbone or charcoal, weighing
5.4, and one of oxygen, weighing 7.” (These figures must be doubled to
bring them into correspondence with the basis hydrogen = 1.) Dalton
established his atomic theory on a very inadequate experimental foundation,
and Thomson’s anonymous correspondent followed Dalton’s example in
drawing inspired deductions from insecure premises. He finds, for
instance, that the specific gravity of oxygen is just 16 times that of hydrogen,
and the specific gravity of nitrogen just 14 times, but he arrives at these
values by treating atmospheric air as a compound constituted by bulk of
four volumes of nitrogen and one volume of oxygen! Consider, moreover,
his remark: “There is every reason for concluding that the specific gravity
of chlorine does not differ much from 2.5 [atmospheric air being 1.000].
On this supposition, the specific gravity of chlorine will be found exactly
36 times that of hydrogen.” No element, it is obvious, could fail to fit into
a scheme of such flexibility.

* Assisted in publication by a grant from the Carnegie Trust for the Universities
of Scotland. This Address will also be published in Proceedings B.

Thomson, however, was a firm believer from the beginning. In an 
account of the improvements in physical Science during the year 1815 he 
wrote (3): “A very important paper was published in a late number of the 
Annals of Philosophy on this subject [the atomic theory]. Though the 
paper in question is anonymous, several circumstances enable me to fix 
with considerable certainty on the author; but as he chuses to remain for 
the present concealed, I do not consider myself as at liberty to mention 
his name.” Thomson had already been engaged for several years in 
atomic weight determinations, and he now embarked enthusiastically on 
a comprehensive series of experimental researches designed to establish 
the validity of the idea advanced by his contributor. 

This same contributor broke into print again also, early in 1816, with 
a second anonymous article in the Annals of Philosophy (4). It was only 
three pages in length, and it purported to be merely the correction of a 
mistake in the previous paper (“an oversight which influences some of the
numbers in the third table”), but its sting lay in its tail: “If the views we
have ventured to advance be correct, we may almost consider the πρώτη ὕλη
of the ancients to be realised in hydrogen.” In other words, not only are 
the atomic weights of heavier elements exact multiples of that of hydrogen, 
but the atoms of heavier elements are to be regarded as aggregates of
hydrogen atoms. Hydrogen is thus promoted to the rôle of the primordial
substance from which, according to the Greek philosophers, all other 
material substances in the universe are derived. A few months later 
Thomson (5) disclosed the fact that this bold speculation emanated from 
the brain of Dr William Prout. It has established itself in the history of 
science as Prout’s Hypothesis.

Who was Dr William Prout? He was born on 15th January 1785 at 
Horton, in Gloucestershire, where his family had been landed proprietors 
for several generations. His early education was neglected but, like many 
other Englishmen before and since, he remedied the deficiency by betaking 
himself to the University of Edinburgh, where he obtained the degree of 
M.D. in 1811. After graduation he carried on a medical practice in 
London, devoting his spare time to research. He became a pioneer 
investigator in the field of physiological chemistry, and was elected a 
Fellow of the Royal Society of London in 1819. He contributed thirty-four
papers in all, mostly upon medical subjects, to various journals. In
his later years he became, owing to deafness, somewhat of a recluse. He 
died on 9th April 1850. 

One of Prout’s discoveries in medicine—the presence, in the gastric 
fluid, of free hydrochloric acid, a most important factor in digestion—was 
of primary significance, but his real claim to fame rests upon the remarkable 
hypothesis of the unity of matter which he proposed with such a combination
of diffidence and audacity at the early age of thirty. Though he took
no active part in its further development, the pious hope which he expressed 
that others, by testing its truth, would throw a new and interesting light 
on the whole science of chemistry has been amply fulfilled. 

In Great Britain the majority of chemists, following the lead of Thomas 
Thomson of Glasgow, accepted the hypothesis of Prout as absolute truth.
Dalton, however, held aloof: “No man can split an atom” was his dogma,
and he could not be induced to abandon it. On the Continent, opinion 
was generally adverse. Berzelius, the last dictator of chemistry, had 
decided from his researches that no simple relation existed between the 
weights of atoms of different elements, and as long as Berzelius lived that 
view almost universally prevailed. Everywhere, however, the urge to 
establish more accurate values for atomic weights, and so put the hypothesis 
to strict trial, led to rapid improvements in analytical technique, and data 
of greater reliability steadily accumulated. 

2. Decline and Fall 

For a long time the bulk of the evidence continued favourable. Turner 
(6) in 1833 and Penny (7) in 1839 did obtain results which led them to
conclude that the hypothesis was not exact, but the disagreement of
certain of their own values weakened their objections. Dumas and 
Stas (8), from a very careful synthesis of carbon dioxide, deduced in 1841 
a figure for the atomic weight of carbon that agreed perfectly. New 
syntheses of water by Dumas (9) in 1843, confirmed by the work of Erdmann 
and Marchand, were also in complete accordance. 

Simultaneously, however, Marignac (10, 11) published the results of a 
whole series of researches undertaken for the purpose of submitting Prout’s 
Law (it was now beginning to be granted this superior status) to a new and 
rigid examination. Marignac declared himself to be of the firm opinion 
that, taking into account the extreme difficulty of attaining experimental 
values of absolute exactness, we cannot consider as contrary to Prout’s Law 
his own figures, determined to three places of decimals, for silver, potassium, 
bromine, iodine and nitrogen. Chlorine, however, presented a clear 
exception; its atomic weight, 35.456, might conceivably be rounded off to
35.5, but an integral value was absolutely excluded. To rescue Prout’s
Law in this emergency, Marignac suggested the adoption of a basic unit
of 0.5, namely half the weight of the hydrogen atom.

It is of interest to note that Prout himself had anticipated this suggestion. 
In a letter to Daubeny (12) in 1831, he mentioned that he saw no reason 
why bodies still lower in the scale than hydrogen might not exist, of which
other bodies might be multiples, without being actually multiples of the
intermediate hydrogen. The substitution of a totally unknown element
for hydrogen as primordial matter was a device, however, which devalued
Prout’s Hypothesis very drastically. When in 1859 Dumas (13), to account
for further discrepancies that had become apparent, proposed a reduction
of the standard to 0.25, it was felt that such a procedure might be continued
indefinitely, depriving the hypothesis of any permanent significance. 
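The force of this objection can be checked with a little arithmetic. The sketch below is an illustration only, not part of the original address: it takes Marignac's figure for chlorine from the text and measures how far it lies from the nearest multiple of each proposed unit. The function names are hypothetical.

```python
def nearest_multiple(weight, unit):
    """Return the multiple of `unit` that lies closest to `weight`."""
    return round(weight / unit) * unit


def misfit(weight, unit):
    """Absolute gap between `weight` and its nearest multiple of `unit`."""
    return abs(weight - nearest_multiple(weight, unit))


CHLORINE = 35.456  # Marignac's value for chlorine, as quoted in the text

# Units proposed in turn: hydrogen itself, Marignac's half, Dumas' quarter.
for unit in (1.0, 0.5, 0.25):
    print(unit, nearest_multiple(CHLORINE, unit), round(misfit(CHLORINE, unit), 3))
```

Since the gap can never exceed half the unit, every reduction of the standard guarantees a closer apparent agreement for any atomic weight whatever, which is why the procedure deprives the hypothesis of meaning.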

The supreme crisis, resulting in what looked like complete collapse, 
came in 1860, when Dumas’ former collaborator, Jean Servais Stas (14),
submitted an impressive memoir to the Royal Academy of Belgium, 
summarising the results of twenty years' researches on atomic weights. 
What regret he experienced in communicating these results may be 
gauged from the following extract from a conversation that he had many 
years later with a young compatriot, Leo Hendrick Baekeland (15): 
“In my youth I was an ardent believer in the unity of matter as expounded
by Prout. I was so well convinced about his theory that I became eager 
to furnish additional proofs by redetermining more accurately the atomic 
weights of those elements where the atomic weight numbers were not an 
even multiple of hydrogen. I simply imagined that more careful determinations
would have eliminated these irregularities. But the more I
worked, the more I perfected my methods, the more I eliminated any
errors of experimentation, so much the more did my results contradict my 
dearest hopes. Finally I had to admit that I was beaten and had spent 
the most important part of my life in killing my first love as a theory.” 

It must be granted that Stas wielded the axe very sympathetically. 
He accorded eloquent homage to the rare penetration of Prout, and 
emphasised the immense importance of his idea of the unity of matter 
from the philosophical point of view, but he proceeded to administer the 
coup de grâce as follows: “All doubt has vanished from my mind. I have
reached the complete conviction, the entire certainty, as far as certainty
can be attained on such a subject, that Prout’s Law, with all the modifications
due to M. Dumas, is nothing but an illusion, a pure hypothesis
expressly contradicted by experiment. Chemists after examining the 
work which I have the honour of now presenting in detailed analysis to 
the Academy, if they can cast away their prejudices and mental prepossessions,
and place their reliance on experiment, will soon share my
conviction, namely, that there does not exist a common divisor for the 
weights of the elements which unite to form all definite compounds.” 

Chemists did, indeed, share the conviction of Stas, and Prout’s Hypothesis
of the unity of matter was decently buried for the next forty years.
There was just one final twitching of the corpse before interment.
Marignac (16), in a commentary upon the memoir of Stas, after paying
tribute to his splendid experimental work, remarked: “It may perhaps be
found astonishing that I do not entirely agree with the conclusions of 
M. Stas that we must consider Prout’s Law as a pure illusion and regard 
the undecomposable bodies of our globe as distinct entities. . . . While 
preserving the fundamental principle of that law, that is to say, while 
adopting the hypothesis of the unity of matter, could we not make for 
example the following supposition, to which I do not otherwise attach 
importance except as showing that we might explain the discordance that 
appears to exist between the results of observation and the immediate 
consequences of that principle? Could we not suppose that the cause 
(unknown but probably different from the physical and chemical agencies 
familiar to us) which has determined certain groupings of the atoms of the 
sole primordial matter so as to give rise to our chemical atoms, by impressing
on each of these groups a special character and particular properties,
should not at the same time have exercised an influence on the manner 
according to which these groups of primordial atoms would obey the law of 
universal attraction in such wise that the weight of each group might not 
be exactly the sum of the weights of the primordial atoms composing 
it?” 





In my own opinion, let me interpolate, this supposition of Marignac, 
to which he himself attached no intrinsic importance and which was 
entirely disregarded by his contemporaries since it contravened the fundamental
law of the conservation of mass, constitutes the most astounding
episode in the whole narrative. Marignac actually, as will appear later,
anticipated the discovery of the packing effect in the atomic nucleus! He
made, however, merely a stab in the dark which it was impossible to 
follow up immediately; fifty years had to elapse before definite proof 
could be adduced that his supposition was valid. 

One indirect line of evidence, which was available during the interval, 
provoked sporadic speculation, but its real significance was never noted. 
Purely for convenience, chemists had gradually changed over from the 
basis hydrogen = 1 to the basis oxygen = 16 for the expression of atomic
weights. Hydrogen combines directly with very few other elements, 
whereas oxygen combines directly with almost all, and the accurate 
determination of the hydrogen-oxygen ratio in water is a matter of extreme 
difficulty. It was therefore necessary, under the hydrogen standard, to 
change the atomic weight of nearly every element on the list each time some 
ingenious person redetermined the hydrogen-oxygen ratio a little more 
exactly. So long as hydrogen retained its rôle of primordial matter, this
awkwardness might be endured, but once Prout’s Hypothesis was discarded, 
chemists naturally turned to the more rigid oxygen basis. 

The difference is only slight—on the scale oxygen = 16, hydrogen 
becomes 1.0078—but it entails a very curious consequence. An extraordinary
proportion of the atomic weights, with oxygen = 16, fall so close
to being integral that the nearest round numbers are exact enough for 
ordinary use. This is not the case when hydrogen = 1 is made the basis, 
as the following typical section of the periodic system, which includes all 
the elements from fluorine to chlorine, will demonstrate. 


Element        Atomic Weight (O = 16)    Atomic Weight (H = 1)

Fluorine              19.00                    18.86
Neon                  20.183                   20.030
Sodium                22.997                   22.822
Magnesium             24.32                    24.13
Aluminium             26.97                    26.76
Silicon               28.06                    27.84
Phosphorus            30.98                    30.75
Sulphur               32.06
Chlorine              35.457                   35.195
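The comparison in the table can be reproduced approximately by dividing each weight on the oxygen scale by the stated value of hydrogen, 1.0078. The following is a minimal sketch under that assumption; the ratio actually used for the printed column is not given, so differences of a unit or two in the second decimal are to be expected. Only a few entries are included, and the function names are illustrative.

```python
H_ON_OXYGEN_SCALE = 1.0078  # hydrogen on the scale oxygen = 16, from the text

# A few weights on the oxygen = 16 scale, copied from the table.
WEIGHTS_O16 = {
    "Fluorine": 19.00,
    "Aluminium": 26.97,
    "Phosphorus": 30.98,
    "Chlorine": 35.457,
}


def to_hydrogen_scale(weight_o16, hydrogen=H_ON_OXYGEN_SCALE):
    """Re-express a weight on the scale hydrogen = 1 (assumed simple division)."""
    return weight_o16 / hydrogen


def gap_from_integer(weight):
    """Distance from `weight` to the nearest whole number."""
    return abs(weight - round(weight))


for element, w16 in WEIGHTS_O16.items():
    w1 = to_hydrogen_scale(w16)
    print(f"{element:<11} O=16: {w16:7.3f} (gap {gap_from_integer(w16):.3f})"
          f"   H=1: {w1:7.3f} (gap {gap_from_integer(w1):.3f})")
```

For fluorine, aluminium and phosphorus the gap from a whole number is far smaller on the oxygen scale than on the hydrogen scale; chlorine remains an exception on both.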


Here is indeed an anomaly. By the abandonment of Prout’s Hypothesis
the chemists came right back to another form of it, since it has been
calculated that the odds are billions to one against such an agglomeration
of atomic weights in the immediate vicinity of integral values occurring 
merely by chance. There are still some flagrant exceptions, of course, 
such as magnesium and chlorine, but it is obvious that the atomic weight 
of hydrogen is essentially out of step with those of other elements. The 
logical deduction, however, that the mass of the hydrogen atom is altered 
by an approximately constant fraction when it is incorporated into heavier 
atoms, was never drawn. Marignac’s supposition, which afforded the 
key to the difficulty, had passed into oblivion. 


3. Resurgence 

The turn of the century brought much more cogent reasons for regarding 
chemical elements as possessing a common basis. Once it was established 
that uranium, by a sequence of radioactive disintegrations in which 
positively charged helium atoms and electrons were ejected, transmuted 
into other elements such as radium and lead, the idea of the unity of matter 
came again to the fore. The approaching centenary of Prout’s Hypothesis 
was fittingly celebrated by Rutherford's formulation in 1911 of the theory 
of the nuclear atom. 

According to this theory, the hydrogen atom consists of a minute 
positively charged nucleus or proton, responsible for almost its entire mass, 
round which rotates a negatively charged unit or electron, responsible for 
almost its entire volume. The atoms of all other elements comprise a 
steadily increasing number of exterior electrons, with nuclei that contain, 
besides an equal number of free protons, a number of proton-electron 
pairs, “collapsed hydrogen atoms” or neutrons. All heavier atoms 
are therefore, essentially, aggregates of hydrogen atoms, and Prout’s 
Hypothesis is rehabilitated. 

Two points demand discussion: how can the abnormality in the 
atomic weight of hydrogen be explained, and how can other recalcitrant 
elements such as chlorine be brought into line? The answer to the 
first question has already been hinted at above, and will be developed 
in detail later. The solution of the second problem may be very briefly 
indicated. 

Chlorine, with an atomic weight of 35.457, has been proved to be a
mixture of two types of atoms, with masses 35 and 37 respectively. Their
structures are: ³⁵Cl = 17 exterior electrons; 17 free protons and 18
neutrons in the nucleus; ³⁷Cl = 17 exterior electrons; 17 free protons and
20 neutrons in the nucleus. All naturally occurring chlorine contains the
two types in the ratio required to make the apparently abnormal atomic
weight the average of two normal aggregates. Many other elements are
similarly isotopic —even the two standards, hydrogen and oxygen, contain 
minute traces of heavier atoms—but all the isotopic varieties of every 
element fit perfectly into Rutherford’s scheme. 
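The ratio in question follows from a simple weighted average. As a rough sketch, taking the two isotopic masses as exactly 35 and 37 (as the text does, and ignoring the small packing effect discussed next), the proportion of each type needed to give 35.457 can be computed as below; the function name is hypothetical.

```python
def light_fraction(mean_mass, light, heavy):
    """Fraction of atoms of mass `light` in a two-isotope mixture
    whose mean atomic mass is `mean_mass`."""
    return (heavy - mean_mass) / (heavy - light)


CHLORINE_MEAN = 35.457  # atomic weight of chlorine, from the text

f35 = light_fraction(CHLORINE_MEAN, 35, 37)  # share of the lighter atoms
f37 = 1 - f35                                # share of the heavier atoms

print(f"mass 35: {f35:.4f}   mass 37: {f37:.4f}   ratio: {f35 / f37:.2f} to 1")
```

On these assumptions a little over three-quarters of the atoms are of mass 35, that is, rather more than three of the lighter type to one of the heavier.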

It was thought at first that all atomic masses might be fundamentally 
integral on the oxygen = 16 basis, with hydrogen (1.0078) as the sole
exception. More recent determinations, which have now reached the 
phenomenal accuracy of five places of decimals, have shown that this is not 
the case—there is a packing factor in the nucleus inducing small, regularly 
varying deviations. 



FIG. 1. Variation of packing effect with atomic mass. (Horizontal axis: Mass Number, 0 to 200.)


Inside the very minute nuclei of all atoms except the simplest, hydrogen, 
protons and neutrons are packed so closely together that their electromagnetic
fields interfere and a fraction of the combined mass is destroyed.
The helium atom, for example, contains the same total material as four
hydrogen atoms, but its mass is not 4 × 1.0078 = 4.0312; it is only 4.003.
The proportionate influence of this packing effect on atomic mass for still
more complex nuclei is almost constant, but not exactly so. This may
best be illustrated by comparing all atomic masses with that of the mass-
spectrograph standard, ¹⁶O. Fig. 1 shows the curve obtained when mass
number (the total number of protons in the nucleus) is plotted against what
Aston has called the packing fraction (the mean gain or loss in mass per
proton when the nuclear packing is changed from that of ¹⁶O to that of
each other atomic species).
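The helium figures quoted above can be worked through directly. The sketch below assumes the usual form of Aston's packing fraction, (atomic mass − mass number) / mass number, which matches the verbal description in the text but is not written out there; the function name is illustrative.

```python
HYDROGEN = 1.0078  # atomic mass of hydrogen on the oxygen = 16 scale (from the text)
HELIUM = 4.003     # atomic mass of helium on the same scale (from the text)

# Mass that disappears when four hydrogen atoms are packed into one helium atom.
mass_defect = 4 * HYDROGEN - HELIUM


def packing_fraction(atomic_mass, mass_number):
    """Mean excess (positive) or deficit (negative) of mass per unit of mass
    number, relative to the standard oxygen = 16. Assumed definition."""
    return (atomic_mass - mass_number) / mass_number


print(f"mass lost in forming helium: {mass_defect:.4f}")
print(f"packing fraction of hydrogen: {packing_fraction(HYDROGEN, 1):.5f}")
print(f"packing fraction of helium:   {packing_fraction(HELIUM, 4):.5f}")
print(f"packing fraction of oxygen:   {packing_fraction(16.0, 16):.5f}")
```

Hydrogen stands far above the other two, which corresponds to the steep initial descent of the curve.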

Several points of interest emerge from this curve. The initial, steeply 
descending sections (the curve splits into two distinct branches for the
earlier elements, following those of odd and even atomic number respectively)
indicate that the masses of atoms lighter than ¹⁶O are slightly
higher —in the single case of hydrogen significantly higher—than whole 
numbers. Through a long range of elements in the central part of the 
diagram the packing fraction is negative, so that atomic masses in this 
interval are slightly lower than whole numbers. In the region of the rare 
earths the curve crosses the zero line once more, and atomic masses are 
again above integral values. A more accurate graph, on a larger scale, 
for these latter sections is given in fig. 2. 



Elucidation of these variations is obtained by the application of the 
theory of relativity. What we have lost in mass through nuclear packing 
we have gained in energy. Mass and energy are not distinct phenomena, 
but interconvertible; matter is potential energy, energy is potential matter. 
The law of conservation of mass holds within our limits of weighing, for all 
ordinary chemical reactions, only because the energy changes involved 
are extremely small. Energy changes in reactions "within the atom", 
however, first demonstrated in radioactive disintegrations, may be 
enormous. Correlating our old laws of conservation of mass and 
conservation of energy into one wider law— conservation of mass-energy — 
we can calculate what change in mass corresponds to any particular 
energy change, and vice versa. We find that the mass change in ordinary
chemical reactions is infinitesimal, and that even in radioactive disintegrations
it is still minute. Conversely, if the mass change is appreciable,
as in the case of hydrogen-helium, the energy change involved in the
rearrangement of the protons and electrons must be stupendous. 
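The size of this energy change can be estimated from the relation E = mc². The following sketch is illustrative only: it uses the hydrogen and helium masses quoted earlier in the address and modern rounded values of the physical constants, which are assumptions not found in the text (the atomic mass unit on the old oxygen scale differs very slightly from the modern one).

```python
SPEED_OF_LIGHT = 2.99792458e8   # metres per second
ATOMIC_MASS_UNIT = 1.66054e-27  # kilograms; modern value, assumed here
JOULES_PER_EV = 1.602177e-19    # joules in one electron volt


def energy_joules(mass_units):
    """Energy equivalent, in joules, of a mass given in atomic mass units."""
    return mass_units * ATOMIC_MASS_UNIT * SPEED_OF_LIGHT ** 2


def energy_mev(mass_units):
    """The same energy expressed in millions of electron volts."""
    return energy_joules(mass_units) / JOULES_PER_EV / 1e6


# Mass lost when four hydrogen atoms become one helium atom (figures from the text).
HELIUM_DEFECT = 4 * 1.0078 - 4.003

print(f"mass lost: {HELIUM_DEFECT:.4f} units")
print(f"energy released per helium atom: about {energy_mev(HELIUM_DEFECT):.1f} MeV")
```

On these figures the formation of a single helium atom corresponds to roughly 26 million electron volts.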

It is very significant that the minimum in the packing fraction curve 
lies in the neighbourhood of iron, one of the most abundant elements in 
nature. Atoms in the region of this minimum should be most stable, 
whereas atoms at the two extremes of the diagram should be susceptible 
to highly exothermic transformations to other types. Such transformations,
in the case of normal radioactive disintegrations, carry us only
short distances along the curve at the right of fig. 2. A much longer 
jump, however, has recently been achieved by the fission of uranium, 
which renders the utilisation of atomic energy possible in peace as well as 
in war. Atomic fusion of the lighter elements would obviously release 
even greater amounts of energy per unit weight of matter transmuted. 
This has not yet been effected on a major scale on our planet (although, by 
bombarding a target of lithium with protons of 700,000 electron volts 
energy, Cockcroft and Walton (17) induced the reaction, ⁷Li + ¹H →
⁴He + ⁴He, where the energy of the ejected α-particles exceeds 8,000,000
electron volts, as early as 1932), but a series of reactions in which the ¹²C
atom functions as a catalyst for the transformation of hydrogen into 
helium (18) is regarded by astrophysicists as offering the first adequate 
explanation for the source of the heat of the sun throughout the lifetime 
of our solar system. 


4. A Vision of the Future 

It may seem that I have strayed from Prout’s Hypothesis in the preceding
paragraphs, but I have expatiated on the topic of energy because I
wish to conclude this address by presenting for your consideration a 
comprehensive extension of Prout’s Hypothesis in which the fundamental 
fact that matter and energy are now practically synonymous is essentially 
involved. I shall restrict my exposition to the barest outline, since I am 
not qualified to delineate it in detail, and I leave it to my colleagues in 
biology, medicine and philosophy to elaborate the evidence on sections 
which I may cover inadequately or omit entirely, and to correct any errors 
that I may inadvertently make. 

Let me first recapitulate the present position with respect to matter. 
Prout's first postulate, that the atomic weights of all other elements are 
exact multiples of that of hydrogen, has had to be discarded, but a perfectly 
logical explanation for all the divergences therefrom is now available. His 
second and more significant postulate, the unity of matter, has been 
triumphantly resuscitated in Rutherford’s picture of the nuclear atom in 
terms of protons and electrons. (Complexities introduced by the discovery
of transitory and intermediate units, such as mesons, positrons, neutrinos,
need not be entered into here.) 

Next let me return for a moment to energy. The interconvertibility 
of all forms of matter has only recently been indubitably proved, but the 
interconvertibility of all forms of energy was established more than a 
hundred years ago. In 1900 Planck advanced the hypothesis that, in the 
same way as matter was made up of discrete particles or atoms, so energy 
also existed only in discrete indivisible units or quanta. The application 
of Planck’s quantum theory has been just as fruitful in its results as 
was the application of Dalton’s atomic theory. In consequence of the 
“atomicity” of energy, the electron orbits in different atoms are restricted
to definite energy-levels, which in the simpler cases can be calculated to 
correspond exactly with experiment. 

The mass of the atom varies according to the number of protons and 
electrons in its composition; the magnitude of the quantum ε is expressed
by the simple equation ε = hν, where h is a universal constant (Planck’s
constant), and ν is the frequency (inversely proportional to the wave-length) of
the energy under consideration. Quanta therefore decrease enormously
in magnitude as we proceed from energy of short wave-length (X-rays)
through the visible spectrum to energy of long wave-length (radio
communication). The variation differs from that of atoms only in its
continuity. 
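
The enormous range in the magnitude of the quantum may be illustrated numerically. The following sketch evaluates ε = hν, with ν obtained as the speed of light divided by the wave-length; the three wave-lengths are round figures assumed for illustration and are not taken from the address.

```python
PLANCK_H = 6.62607015e-34        # Planck's constant, joule seconds
LIGHT_C = 2.99792458e8           # speed of light, metres per second
JOULES_PER_EV = 1.602176634e-19  # one electron volt in joules


def quantum_energy_ev(wavelength_m):
    """Energy of one quantum, epsilon = h * nu with nu = c / wavelength, in eV."""
    frequency_hz = LIGHT_C / wavelength_m
    return PLANCK_H * frequency_hz / JOULES_PER_EV


# Illustrative wave-lengths in metres (assumed values).
examples = {"X-ray": 1e-10, "visible (green)": 5.5e-7, "radio": 300.0}
energies = {name: quantum_energy_ev(lam) for name, lam in examples.items()}
for name, energy in energies.items():
    print(f"{name:16s} {energy:.3e} eV")
```

On these figures the X-ray quantum is of the order of ten thousand electron volts, the visible quantum a few electron volts, and the radio quantum a minute fraction of one electron volt.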

Apart from this detail, matter and energy stand now on precisely the 
same footing: the unity of matter is duplicated by the unity of energy. 
Matter and energy, as described in the preceding section, are also mutually 
interconvertible, a fact expressed in what we may truly regard as the 
modern form of Prout’s Hypothesis, the law of the conservation of mass-
energy with the implications attendant thereupon (see p. 9).

Now, besides matter and energy, there is a third principle which has 
been recognised as existent in the universe, a principle just as varied in its 
manifestation as matter and energy themselves, a principle which the 
philosophers called mind, but which we may amplify here to the still more
general concept of life. 

So long as matter and energy were regarded as distinct phenomena 
there was no reason at all to surmise that life had any fundamental
correlation with either, but now that matter and energy have been completely
integrated, it is surely opportune to analyse an hypothesis that vital 
phenomena are also amenable to inclusion in the same category. Such 
an analysis, indeed, I propose to make, with Prout’s Law (it well merits 
that title again to-day) as my guide. Perhaps I can best make clear the 
idea that I wish to convey by considering what conclusions scientists of
different periods in the past might logically have reached if they had
attempted a similar analysis, with the evidence then at their disposal, 
before I develop what may be inferred from present knowledge. 

Previous to Prout, the line of reasoning would have been simple. 
Many elements were known, all quite distinct from one another, quite 
unsusceptible to change. By analogy, the many forms of life might be 
assumed to be absolutely unrelated, immutable from the time of the 
creation. Lavoisier’s statement of the indestructibility of matter would 
be paralleled by the statement of the immortality of the soul. Everything, 
in short, would be in complete accordance with orthodox views. If 
energy were also brought into consideration, Rumford’s experiment on 
the conversion of mechanical to thermal energy might have been deemed 
disturbing; but too little was known at that time about energy to be 
of any real value. 

After the advent of Prout’s Hypothesis, however, the argument would 
have run very differently. All the many varieties of matter are
fundamentally related; consequently the assumption would follow, by analogy,
that all the many varieties of life are similarly related. Hydrogen, the 
simplest and primordial form of matter, from which all other forms of 
matter are derived, would be paralleled by the single cell, the simplest and 
primordial form of life, from which all other forms of life are derived. This 
basis for a unitary conception would have been strengthened, around the 
middle of the nineteenth century, by the recognition of the
interconvertibility of all forms of energy. Had biology been a little further advanced,
in fact, the line of reasoning would have accorded exactly with that of the 
early evolutionists, culminating in Herbert Spencer’s philosophical doctrine 
of general development in the universe from homogeneous and simple to 
heterogeneous and complex. 

After 1860, any aspiring co-ordinators would have found themselves
in a quandary. With the downfall of Prout’s Hypothesis, the various
forms of matter were once more regarded as essentially distinct and
incommutable. Interconvertibility of the various forms of energy,
however, remained uncontested. It is obvious that no conclusions, by analogy,
regarding the various forms of life could possibly be reached. 

Our present basis for inquiry, to which I finally revert, is vastly more 
promising. Matter and energy have been not merely brought into line 
but amalgamated, providing an harmonious system in which it is plausible 
to postulate that life may also be incorporated. Carrying our analogy to 
the extreme would involve the hypothesis of the existence of a
comprehensive future law which might be termed the law of the conservation of
mass-energy-life; but the enunciation of such an hypothesis at present
would be altogether premature, since its implications would be far too
indefinite and there would be no means of testing its validity. The reasons 
for our inadequacy may be explained very briefly. 

Although we know that all matter is, by ultimate analysis, reducible 
to energy, it is only in the case of a few of the simplest forms of matter that 
we are able to express their properties quantitatively on an energy basis 
(see p. 11). With increasing complexity of the atom, most relationships— 
X-ray spectra provide a welcome exception—soon become too intricate for 
calculation. Much more is this so when we proceed to compounds, 
particularly to large molecules. While we recognise that constitutionally 
these are still ultimately reducible to energy, we recognise also the utter
impossibility of establishing their properties purely in terms of energy
until enormous advances have been made in our present knowledge.
We continue, therefore, to investigate the “properties of matter” and to 
consider them as such in the meantime. 

If now we postulate, by analogy, that all vital phenomena may be 
ultimately expressed in terms of matter and energy, we immediately find 
ourselves in a similar predicament. These phenomena relate to units so 
complex that the largest synthetic organic molecules are simple in
comparison. Consequently, although vital processes may be successfully
described in physico-chemical terms up to a certain point, we soon reach 
limits where explanation on such a basis becomes incomplete or quite 
impracticable. Pending further progress in our knowledge which will 
push forward these limits, therefore, we continue to investigate the
“properties of living material” and to consider the processes involved as 
“life processes”. 

The gap here is comparable, indeed, with that which existed between 
matter and energy in 1860. The physicists of the present century gradually 
narrowed that gap until it has been securely bridged, but even if a second 
Marignac were to hazard an inspired guess as to how the link of matter 
and energy with life might be fully established, we should be as little 
qualified to interpret or verify it as the contemporaries of Marignac were 
to interpret or verify his prognostication of the packing effect (p. 6). 

Within restricted bounds, nevertheless, I believe that the analogy that 
I have postulated does enable us to formulate deductions of immediate 
interest. A brief, and necessarily incomplete, presentation of one such 
topic is attempted in the final section of this address. 

5. A Present Glimpse of the Vision of the Future 

In a myriad different species of living bodies we can definitely
distinguish a material factor, the number and nature of their respective cell
chromosomes, reminiscent on a larger scale of the arrangements of atoms
and groups of atoms in a myriad different complex chemical compounds. 
Associated with this material factor so long as life persists, an energy factor 
is demanded, which used to be designated vital energy. It is now generally 
recognised (19) that this is not inherently distinct from other forms of 
energy in any of its many manifestations susceptible to experimental 
investigation, ranging from mechanical and thermal energy to the electrical 
discharges along nerve-fibres (thought impulses) established by recent 
studies in electro-encephalography. Whether this association of matter 
and energy is sufficient to account for more abstract features of vital 
phenomena (e.g., instinct, intelligence, reasoning, etc.) is a highly
controversial question which lies completely beyond the bounds of present inquiry.

For the continuance of life in any specimen thereof, the maintenance 
of stable conditions among the material and energy factors is essential; 
these conditions become more complex the greater the complexity of the 
organism. Some of the limitations (e.g., temperature range, chemical
environment) have been definitely determined; with respect to others, 
however, our knowledge remains very elementary. A typical example 
of the present stage of experimental progress on one vital problem—the 
propagation of living tissue in vitro —may be cited. While it has been 
found possible to maintain simple tissues, such as chicken embryo tissue, 
growing continuously after removal from the body for decades in carefully 
adjusted synthetic surroundings, the conditions favouring and disfavouring 
the persistence of more complex aggregates provide innumerable problems 
still in the initial stages of attack. 

One most important fact, nevertheless, has been experimentally 
proved. Many independent investigations have shown that mutations 
in living species can be artificially induced both by drastic change in energy 
environment (exposure to short-wave irradiation) and by drastic change in 
material environment (exposure to chemical action). Mention may be 
made, in particular, of the experiments of Muller (20) with X-rays and of 
Auerbach and Robson (21) with mustard gas. These investigations 
demonstrate beyond doubt that by means of matter and by means of 
energy we can change the character of the chromosomes, or of the genes 
that they carry, and so vary the form of life. The changes in the
chromosomes are often, though probably not always, visible, and are structural
in nature, consisting of inversions, translocations and deletions. Bacteria, 
unicellular organisms with an internal structure so delicate that they do 
not possess the obvious chromosomes of higher forms of life, also exhibit 
changed properties on exposure to drugs or to ultra-violet light. 

This convertibility of one form of life to another by means of matter
and energy, limited as it is at the moment, offers wide possibilities for
future developments. Reasoning by pure analogy from similar
developments in the fields of matter and energy in the past, I submit now what I
believe to be a legitimate hypothesis on the basis of current knowledge— 
the co-ordination of constitutional changes in mass-energy-life. 

Let us consider the cell as our unit, analogous to a complex molecule, 
and consider the chromosomes as corresponding to constituent groups 
thereof. On this basis, if parallelism does prevail, the various types 
of mutagenic reactions should duplicate the various types of chemical 
reactions. The following classification suggests that this is substantially 
true:— 

(a) The majority of spontaneous mutations, resulting in the gradual 
evolution of new species, may be regarded as equivalent to ordinary 
chemical changes with very low reaction rates. (Any rapid reaction, it 
may be noted, would have been long ago completed.) Like ordinary 
chemical reactions, these spontaneous mutations should (and do) exhibit 
large temperature coefficients. 

(b) A small fraction of spontaneous mutations, induced by the action
of stray radiations, may be regarded as equivalent to photo-chemical 
reactions. As such, they should not normally exhibit any temperature 
coefficient. This question cannot be tested directly, but its presumable 
validity is indicated in (d) below.

(c) Artificial mutations induced by chemical agents such as mustard
gas may be regarded as equivalent to catalytic reactions. The mustard 
gas or other agent functions as a catalyst, increasing the natural rate of 
reaction very significantly. Spontaneous mutations may possibly be 
due, in whole or part, to the natural occurrence of minute amounts of such 
catalysts in living material. Whether this is so or not, chemically induced 
mutations should show a large temperature coefficient. This important 
point has not yet been investigated experimentally. 

(d) Artificial mutations induced by irradiation may be regarded as
equivalent to photo-catalytic reactions. Their rate is proportional to the 
intensity of the radiation, which is here enormously increased over that in 
(b) above. They should not, as photo-catalytic reactions, exhibit any
temperature coefficient (although subsequent chemical reactions may 
introduce a small temperature factor), and experiment shows that this is 
in fact the case. 

Most of the changes resulting from reactions of the above types, 
inasmuch as they affect only a small part of a complicated “molecule”,
may be expected to be limited in their influence on the properties of the 
organisms as a whole. New forms of life manufactured thereby, as in the
case of new strains of Penicillium chrysogenum obtained by X-ray
irradiation, will therefore in general exhibit only minor differences from their
natural source. Major changes may be anticipated to be rare, and in 
most cases to be harmful, recessive, or lethal. 

Nevertheless, occasional modifications of extensive character must be 
produced, which are capable of survival and persistence. These more 
significant redistributions within the cell, necessary for the emergence of 
essentially new species, presumably all have had a purely chemical basis. 
In such evolutionary developments, both “chromosome fissions” and
“chromosome fusions” (compare p. 10) of an abnormal type probably
play an important part. 

What I have put forward for your consideration is at present, of course, 
merely an hypothesis, and by no means an entirely novel hypothesis. I 
trust that you will not think me presumptuous if I dare to cite a precedent. 
Prout himself, when he first ventured to say in 1816, “We may almost 
consider the πρώτη ὕλη of the ancients to be realised in hydrogen”,
followed this up immediately with the words: “an opinion, by the by, not
altogether new.” This warrants the assumption that the idea was “in the
air”, and vague foreshadowings may be recognised as existing, in more or 
less hazy form, in certain passages in the writings of Dalton, Davy and 
Thomas Thomson (22). In the same way, views on the atomicity of the 
soul were expressed by the Greek philosophers, and the equation of vital 
processes with matter and energy has been so frequently mooted, more or 
less empirically, in recent years, that many scientists accept it as a truism 
even beyond present experimental limits. As far as I know, however— 
I speak open to correction—the numerous arguments advanced and 
inferences drawn have never been reinforced by the compelling analogy 
of the historical evolution of the Law of Prout. 

I have been forced, like Prout, to be indefinite and tentative, and I 
should be glad if you would permit me to turn back to the first page of this 
paper and repeat the first paragraph of his first communication in full. It 
is quite probable that, in far less than fifty years' time, some second Jean 
Servais Stas will, on the basis of much wider and more exact evidence, 
pronounce his conviction that even the co-ordination of mutations in life 
with mutations in matter and energy is an illusion, a pure hypothesis 
expressly contradicted by experiment. But I trust I may be pardoned 
for not excluding the possibility that, still later, some second Rutherford 
will arise to put into concrete form a universal law (hinted at on p. 12) 
which will transcend my partial exposition more brilliantly than our 
present Law of Prout transcends the hypothesis which he so hesitantly 
advanced in 1816. 





Much of the material in the earlier part of this address is taken from 
the historical introduction to Alembic Club Reprint, No. 20 (E. & S. 
Livingstone, Edinburgh), which reproduces under the title Prout’s
Hypothesis the original papers of Prout, Stas and Marignac. Since several
reviewers have regretted the fact that this historical introduction was 
unsigned, it may be proper for me to disclose now that it was the joint 
effort of Dr Leonard Dobbin (the senior Fellow of this Society, still 
scientifically active in the sixty-ninth year of his membership and the 
ninety-second year of his age) and myself. My sincere thanks are due to 
Dr Dobbin for his contribution to this odyssey of the Hypothesis of Prout. 


REFERENCES TO LITERATURE 

(1) [Prout], 1815. Annals of Philosophy, vi, 321-330.
(2) Dalton, 1810. New System of Chemical Philosophy, Part II, 377.
(3) Thomson, 1816. Annals of Philosophy, vii, 17.
(4) [Prout], 1816. Annals of Philosophy, vii, 111-113.
(5) Thomson, 1816. Annals of Philosophy, vii, 343.
(6) Turner, 1833. Phil. Trans., cxxiii, 523-544.
(7) Penny, 1839. Phil. Trans., cxxix, 13-33.
(8) Dumas and Stas, 1841. Ann. chim. phys. [3], i, 5-38.
(9) Dumas, 1843. Ann. chim. phys. [3], viii, 189-207.
(10) Marignac, 1842. C.R., xiv, 570-573.
(11) Marignac, 1843. Bibl. Univ., xlvi, 350-377.
(12) Daubeny, 1831. Introduction to the Atomic Theory, 129-130.
(13) Dumas, 1859. Ann. chim. phys. [3], lv, 129-210.
(14) Stas, 1860. Bull. Acad. Roy. Belg. [2], x, 308-336.
(15) Kendall, 1949. Chemistry and Industry (Baekeland Memorial Lecture), xxvii, 67-70.
(16) Marignac, 1860. Bibl. Univ., ix, 97-107.
(17) Cockcroft and Walton, 1932. Proc. Roy. Soc. London, A, cxxxvi, 619.
(18) Bethe, 1939. Phys. Rev., lv, 434.
(19) Haldane, 1949. Philosophy for the Future, The Macmillan Company, 204.
(20) Muller, 1927. Science, lxvi, 84.
(21) Auerbach and Robson, 1947. Proc. Roy. Soc. Edin., B, lxii, 271, 284, 307.
(22) [Dobbin], 1932. Alembic Club Reprint, No. 20 (E. & S. Livingstone, Edinburgh), 5.


(Issued separately May 17, 1950) 




II.—The Stability of Solutions of Non-linear Difference- 

differential Equations.* By Professor E. M. Wright, 

University of Aberdeen. 

(MS. received November 23, 1948. Read February 7, 1949) 

Synopsis 

Poincaré, Liapounoff, Perron and others have proved theorems about the order of
smallness, as the independent variable tends to $+\infty$, of solutions of differential equations
with non-linear perturbation terms. A similar theory exists for difference equations. 
By a simple use of transforms, we here extend the theorems, with suitable modifications, 
to difference-differential equations. The results are an essential step in the development 
of a general theory of non-linear equations of this type. 

1. A DIFFERENCE-DIFFERENTIAL equation is one involving some or all
of the functions

\[ y^{(\nu)}(x + b_\mu) \qquad (0 \le \mu \le m,\ 0 \le \nu \le n) \]

of a real variable $x$, where $m$, $n$ are positive whole numbers and

\[ 0 = b_0 < b_1 < \dots < b_m. \]

As “boundary conditions” we suppose assigned the value of $y^{(\nu)}(0)$ for
$0 \le \nu < n$ and of $y^{(n)}(x)$ in the initial interval $0 \le x < b_m$, $y^{(n)}(x)$ being
of integrable square in this interval. We regard our equation as an
integral equation in the unknown function $y^{(n)}(x)$ and define $y^{(\nu)}(x)$
for $\nu < n$ by

\[ y^{(\nu)}(x) = y^{(\nu)}(0) + \int_0^x y^{(\nu+1)}(t)\,dt. \tag{1.1} \]

We say that a solution exists for $0 \le x \le X$ if there is an integrable
$y^{(n)}(x)$ which satisfies the equation for these values of $x$. In what follows
any statement involving $\mu$ or $\nu$ holds, unless the contrary is stated, for
all integral $\mu$, $\nu$ such that $0 \le \mu \le m$, $0 \le \nu \le n$. We use $\sum_\mu$ and $\sum_\nu$
to denote summation over these ranges.

* This paper was assisted in publication by a grant from the Carnegie Trust for the
Universities of Scotland.

Several authors have discussed the linear equation with constant
coefficients, viz.

\[ \Lambda(y) \equiv \sum_\mu \sum_\nu a_{\mu\nu}\, y^{(\nu)}(x + b_\mu) = v(x), \tag{1.2} \]

where $v(x)$ is a known function. References are given in Wright (1949 a)
to their work and to applications of special cases of (1.2) to economics, 
radiology and the theory of control mechanisms. A particularly important 
case of (1.2) is the homogeneous equation 

\[ \Lambda(y) = 0, \tag{1.3} \]

in which $v(x) = 0$. It is readily verified that $y(x) = e^{s_r x}$ is a solution of
(1.3), provided $s_r$ is a root of the transcendental equation

\[ \tau(s) \equiv \sum_\mu \sum_\nu a_{\mu\nu}\, s^{\nu} e^{b_\mu s} = 0. \tag{1.4} \]

In fact, under fairly wide conditions, the general solution of (1.3) is

\[ y(x) = \sum_r P_r(x)\, e^{s_r x}, \]

the summation extending over all the zeroes of $\tau(s)$. Here $P_r(x)$ is a
polynomial of degree one less than the order of the zero $s_r$ of $\tau(s)$, and the
coefficients in $P_r(x)$ can be expressed fairly simply in terms of the
“boundary conditions”.

If we replace the fixed numbers $a_{\mu\nu}$ in (1.2) by functions $A_{\mu\nu}(x)$
tending to $a_{\mu\nu}$ as $x \to +\infty$, we have a “perturbed” equation. The
relationship between the solutions of this new equation and those of (1.2)
was found in Wright (1948), which contains references to work by
Poincaré, Perron and Bochner on the corresponding problem for pure
differential and pure difference equations. 
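
The statement that an exponential solves the homogeneous equation (1.3) whenever its exponent is a zero of the transcendental function in (1.4) can be checked numerically in a simple special case. The sketch below assumes a hypothetical example with $m = n = 1$, $b_1 = 1$, $a_{11} = 1$, $a_{00} = A$ and all other coefficients zero, so that (1.3) reads $y'(x+1) + A\,y(x) = 0$ and $\tau(s) = s e^{s} + A$. The value $A = 0.2$, the function names and the use of Newton's method are illustrative choices, not part of the paper.

```python
import math

A = 0.2  # hypothetical coefficient; a real zero of tau exists because A < 1/e


def tau(s):
    """tau(s) = s*e^s + A, the form of (1.4) for y'(x+1) + A*y(x) = 0."""
    return s * math.exp(s) + A


def tau_prime(s):
    """Derivative of tau, needed for the Newton step."""
    return (1.0 + s) * math.exp(s)


def newton_root(s0, steps=50):
    """Plain Newton iteration for a real zero of tau, starting from s0."""
    s = s0
    for _ in range(steps):
        s -= tau(s) / tau_prime(s)
    return s


s_r = newton_root(0.0)


def residual(x):
    """Left-hand side y'(x+1) + A*y(x) for the trial solution y(x) = exp(s_r*x)."""
    return s_r * math.exp(s_r * (x + 1.0)) + A * math.exp(s_r * x)


print(s_r, max(abs(residual(x)) for x in (0.0, 1.0, 5.0)))
```

The residual vanishes to rounding error because it factorises as $e^{s_r x}\,\tau(s_r)$. Only one real zero is located here; the general solution sums over all zeroes, most of which are complex.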

Here I consider another form of perturbed equation, viz. that obtained
by adding to the left-hand side of (1.3) a non-linear function of the
$y^{(\nu)}(x + b_\mu)$ for $\nu < n$, this function being of smaller order than the linear
terms when these last are themselves small. Under suitable conditions
I prove that, if the $a_{\mu\nu}$ are such that all solutions of (1.3) tend to zero
as $x \to +\infty$, and if the values of $y^{(\nu)}$ given in the boundary conditions
are sufficiently small, then the $y^{(\nu)}$ in the non-linear equation remain
small and, in fact, tend to zero as $x \to +\infty$. Poincaré (1892), Liapounoff
(1907), Perron (1928, 1929) and many others have found similar results
for pure differential and pure difference equations. Bellman (1947) is 
the latest to write on the topic and gives further references. 

An obvious solution of the equation which I consider here is the 
zero solution $y(x) = 0$. My present result is equivalent to the statement
that, under suitable conditions, the zero solution is stable; that is, a
solution initially small enough approaches the zero solution as $x \to +\infty$.

In Wright (1946) I stated that part of the theory of the non-linear 
equation could be developed in three stages. In the first we seek initial
conditions which ensure that a solution tends to zero as $x \to +\infty$. In
the second we show under suitable hypotheses that every solution tending
to zero is $O(e^{-cx})$ for some $c > 0$. In the third, with which I was then
concerned, we find an asymptotic expansion for a solution which
is $O(e^{-cx})$. In the present paper, then, I solve the first stage (as it
happens, the first and second stages together) for a fairly wide class of
equations.

There are at least two methods available to investigate problems of 
this type. One of these is based on certain results of Pitt (1944, 1947).
Elsewhere (Wright, 1949 b) I apply it to consider the stability of solutions
of a more general class of equations in which $\Lambda(y)$ is replaced by

2 fV’^-*)<»,(*) 

p Jo 

and the non-linear perturbation by an appropriate functional of $y$. By
this means we can also find results corresponding to those of Cesari (1939),
Levinson (1946) and Weyl (1946) on the boundedness of solutions of
differential equations. But it seems worth while to present here a short 
proof by another method (a little simpler if a little less powerful) for 
the case of the difference-differential equation. For the latter equation, 
the existence and uniqueness of a solution are trivial corollaries of well- 
known results in the theory of differential equations. For the more 
general functional equation these questions require investigation. 

Since I wrote this paper Dr R. Bellman has very kindly sent me a 
manuscript copy of a paper of his (Bellman, 1949) on a similar problem, 
viz. that in which $\Lambda(y)$ contains $y^{(n)}(x + b_m)$ and no other $y^{(n)}(x + b_\mu)$. Such an
equation can be reduced to a set of $n$ simultaneous equations containing
no derivative of order higher than the first, and so he studies in detail
only the latter type of equation. When $\Lambda(y)$ contains more than one $y^{(n)}$,
this reduction method is not applicable.

2. We take $a$ a fixed positive number and write $\psi(x, w)$ for a continuous
function of $x$ and of the variables

\[ w_{\mu\nu} \qquad (0 \le \mu \le m,\ 0 \le \nu \le n-1) \]

throughout the region $R$, in which

\[ x \ge 0, \qquad |w_{\mu\nu}| \le a. \]

We suppose that, in $R$, whenever

\[ \sum_\mu \sum_\nu |w_{\mu\nu}| \le W, \]

then

\[ |\psi(x, w)| \le W \chi(W), \tag{2.1} \]

where $\chi(W)$ is a bounded function of $W$ alone for $W \le m(n-1)a$ and
such that

\[ \int_0 \frac{\chi^2(W)}{W}\,dW \tag{2.2} \]

converges.

In what follows $K$ always denotes a positive number, not always
the same at each occurrence, independent of $x$, $X$, $\xi$, $y$, $z$, $w$, $t$, $\epsilon$, $\delta$, but
possibly depending on some or all of

V c '» 4 a * (3.3) 

The constant implied in the $O(\ )$ notation is a $K$. Both $\epsilon$ and $\delta$ are
positive numbers, to be thought of as small; the choice of $\delta$ is subsequent
to that of $\epsilon$.

We shall consider the equation 

\[ \Lambda(y) + \psi(x, y) = 0, \tag{2.4} \]

where $\Lambda$ is the linear operator defined in (1.2) and $\psi(x, y)$ denotes the
result obtained on putting $w_{\mu\nu} = y^{(\nu)}(x + b_\mu)$ in $\psi(x, w)$. We say that $y$
satisfies a $\delta$-set of initial conditions if the usual set of initial conditions
is such as to make

\[ |y^{(\nu)}(0)| < \delta, \qquad |y^{(n)}(x)| < \delta \quad (0 \le x < b_m). \]

By repeated use of (1.1) we see that these imply that

\[ |y^{(\nu)}(x)| < K\delta \qquad (0 \le x \le b_m), \tag{2.5} \]

except possibly for $y^{(n)}(b_m)$.

If $a_{mn} \ne 0$, the zeroes of $\tau(s)$ have their real parts bounded above
(see, for example, Lemma 2 (iii) of Wright 1949 a). We denote the
least upper bound of these real parts by $-d$ and shall here suppose
that $d > 0$.

We shall prove the following 

Theorem. If $a_{mn} \ne 0$, if $0 < c < d$, and if $\psi(x, y)$ satisfies the above
conditions, then, for every $\epsilon > 0$, there exists a $\delta > 0$ such that any solution
$y$ of (2.4) satisfying a $\delta$-set of initial conditions exists and satisfies

\[ |y^{(\nu)}(x)| < \epsilon e^{-cx} \qquad (\nu < n) \tag{2.6} \]

for all $x \ge 0$.
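
The behaviour asserted by the theorem can be observed numerically in a simple case. The sketch below is an illustration under stated assumptions and forms no part of the proof: it takes the hypothetical equation $y'(x+1) + a\,y(x) + y(x)^2 = 0$ (so $m = n = 1$, $b_1 = 1$, and the non-linear term is $y^2$), with the assumed value $a = 0.5$, for which the zeroes of $s e^{s} + a$ have negative real parts, and integrates it by Euler's method from the small constant initial data $y = \delta$ on $0 \le x \le 1$. The step length, the range of integration and the function name are arbitrary choices.

```python
def simulate(delta, a=0.5, h=0.01, x_end=20.0):
    """Euler scheme for y'(x+1) = -a*y(x) - y(x)**2 with y = delta on [0, 1]."""
    lag = round(1.0 / h)        # number of grid steps in one unit shift
    ys = [delta] * (lag + 1)    # initial data on the grid points of [0, 1]
    for _ in range(round(x_end / h)):
        delayed = ys[-1 - lag]  # value one unit behind the newest grid point
        slope = -a * delayed - delayed ** 2
        ys.append(ys[-1] + h * slope)
    return ys


delta = 0.05
ys = simulate(delta)
print(max(abs(v) for v in ys), abs(ys[-1]))
```

With these values the computed solution never exceeds its initial size and has become negligible at the end of the range, as the theorem leads one to expect for sufficiently small $\delta$.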

We may clearly always replace $\epsilon$ by any smaller number without
loss of generality, and we shall always suppose that $\epsilon < \tfrac14 a$. In general
the statement “for $\epsilon < K$” or “for $\delta < K$” is to be understood, but $\delta$
is always chosen subsequently to $\epsilon$ and neither may be chosen to depend
on $x$ (or on $X$ or on $\xi$).

3. The question of existence is fairly trivial. We prove 

Lemma. For $\epsilon < K$, there is a number $\beta = \beta(\epsilon) > 0$ independent of
$X$ such that, if a solution of (2.4) exists and satisfies (2.6) for $0 \le x \le X$,
where $X \ge b_m$, the solution exists for $0 \le x \le X + \beta$.

Let us suppose that $X < x \le X + b_m - b_{m-1}$. If we put $x - b_m$ for $x$
in (2.4) and write

\[ u(x) = \sum_\mu a_{\mu n}\, y(x + b_\mu - b_m), \]

we transform (2.4) into

\[ u^{(n)}(x) = \psi_1(x, u, y). \tag{3.1} \]

In this equation $\psi_1$ is a function of $x$, of the $u^{(\nu)}(x)$ for which $\nu < n$ and of
the

\[ y^{(\nu)}(x + b_\mu - b_m) \qquad (\mu < m,\ \nu < n). \tag{3.2} \]

Also, by the properties of $\psi(x, y)$, provided each of these $u$ and $y$ functions
is suitably bounded, $\psi_1$ is bounded uniformly with respect to $x$ and $X$
and is a continuous function of $x$, of the $u^{(\nu)}(x)$ with $\nu < n$ and of the
functions (3.2).

Since $x + b_\mu - b_m \le X$ for $\mu < m$, each function of (3.2) has its
modulus less than $\epsilon$, is continuous and may be regarded as known. We
may then regard (3.1) as a differential equation in the unknown function
$u(x)$, in which $\psi_1$ is a continuous function of $x$ and of the $u^{(\nu)}(x)$ with
$\nu < n$, which is bounded uniformly with respect to $x$ and $X$, provided
the $u^{(\nu)}(x)$ are suitably bounded for $\nu < n$. A standard result from the
theory of differential equations (see, for example, Satz 2 of Perron,
1918) then shows that $u^{(\nu)}(x)$ exists and is continuous for $\nu < n$, and
$X \le x \le X + \beta \le X + b_m - b_{m-1}$, for some positive $\beta = \beta(\epsilon)$ independent
of $X$. By (3.1) the same is true for $u^{(n)}(x)$. Now, by hypothesis,
$y^{(n)}(x + b_\mu - b_m)$ exists and is integrable for $\mu < m$, and so the same is
true for

\[ a_{mn}\, y^{(n)}(x) = u^{(n)}(x) - \sum_{\mu < m} a_{\mu n}\, y^{(n)}(x + b_\mu - b_m), \]

provided that $X < x \le X + \beta$. Since $a_{mn} \ne 0$, this is the result of the
lemma.

Now, for $\nu < n$ and $0 \le x \le b_m$, $y^{(\nu)}(x)$ is given by (1.1) and the
initial conditions independently of the value of $y^{(n)}(b_m)$. The latter is
then defined by (2.4) with $x = 0$. Hence, for $\delta = \delta(\epsilon)$ sufficiently small,
the hypothesis of the lemma is satisfied for $X = b_m$ by any solution
satisfying a $\delta$-set of initial conditions. Hence we may apply the lemma in
successive steps.

From this we see that, if our solution ceases to exist at any value
of $x$, (2.6) must have been violated for some smaller positive value of $x$
and for at least one value of $\nu < n$, say $\nu = \lambda$. But $y^{(\nu)}(x)$ $(\nu < n)$ is
continuous by (1.1), and so, if our theorem is false, there must be a
number $\xi > b_m$ and a $\lambda < n$, such that

\[ |y^{(\lambda)}(\xi)| = \epsilon e^{-c\xi} \tag{3.3} \]

and

\[ |y^{(\nu)}(x)| \le \epsilon e^{-cx} \qquad (0 \le x \le \xi) \tag{3.4} \]

for all $\nu < n$.

Our theorem does not assert that $y$ is unique for all $x \ge 0$. It is
enough to ensure uniqueness if $\psi(x, w)$ satisfies a Lipschitz condition of
order 1 with respect to every $w_{\mu\nu}$.

4. We now assume (3.3) and (3.4) true for a particular $\xi$ and show
that they lead to a contradiction. This will complete the proof of the 
theorem. 

We write 


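\[ v(x) = \begin{cases} -\psi(x, y) & (0 \le x \le \xi - b_m), \\ 0 & (x > \xi - b_m), \end{cases} \]

and consider the linear equation

\[ \Lambda(z) = v(x) \tag{4.1} \]

in the unknown function $z$. If we suppose $z$ to satisfy the initial conditions

\[ z^{(\nu)}(0) = y^{(\nu)}(0) \quad (\nu < n), \qquad z^{(n)}(x) = y^{(n)}(x) \quad (0 \le x < b_m), \]

it follows from Theorem 1 of Wright (1949 a) that $z(x)$ is uniquely defined
for all $x \ge 0$. If we put $z(x) = y(x)$ for $0 \le x \le \xi$, (4.1) is satisfied for
$0 \le x \le \xi - b_m$ by (2.4). Hence

\[ z^{(\nu)}(x) = y^{(\nu)}(x) \qquad (0 \le x \le \xi). \]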

Next we observe that, since $v(x) = 0$ for $x > \xi$,

\[ \int_0^\infty v^2(x)\, e^{-2\sigma x}\,dx \]

is absolutely convergent for all $\sigma$, and so, by Theorem 1 of Wright (1948),

\[ \int_0^\infty |z^{(\nu)}(x)\, e^{-\sigma x}|^2\,dx \]

is convergent for every $\sigma > -d$ and every $\nu$. But

r<'> | * [*e-«-°'*dx 

Jo 


fi'» | e* m dxj < | X | 
by Schwarz’s inequality, and so 


(0 


r I I 

is convergent for every v. 

For the rest of this paper we take the real part of $s$ equal to $-c$ and
write $s = -c + it$. If


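\[ Z_\nu = Z_\nu(s) = \int_0^\infty z^{(\nu)}(x)\, e^{-sx}\,dx, \]

the integral is absolutely convergent and, for $x > 0$,

\[ z^{(\lambda)}(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} Z_\lambda(s)\, e^{sx}\,dt \tag{4.2} \]

by Theorem 7.3 of Widder (1941), since $z^{(\lambda)}(x)$ is an integral and so
continuous and of bounded variation.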

If we multiply (4.1) by $e^{-sx}$ and integrate from 0 to $+\infty$ we have


22 +*<'>, 


(4-3) 


where 


M » 


Now 


V(s)-V v(x)e''*dx, H(s) — 

J0 ft w 

- d>^*z<’'(x)r-*dx >>(*„ - *)«“</*. 

£ *<'>(*)«—</* - e'- 1 '(x)e-*dx. 


Since both integrals are absolutely convergent, z^*~ l) (x)e~’ T —► o as 
x -*■«, and so 

Using this repeatedly we have, if v > A, 

F -1 

J- j’Z* - £ r v+ ’— 

■=x 

- J 'Z A -," 1 2<M(o) + C>(8r'- s ). (4.4) 

On the other hand, if v < A, 


A-l 


s*Z,-s’Z K + 2 i f + ’-— V“>(o) 

W — F 


( 4 -S) 



Using (4.4) and (4.5) in (4.3), we have 

r( j)Z x +j* V{s) + fH{s) + (4.6) 

It is well known that, since $c > 0$,

r-«+<•> gimt 

I — ds-o (4.7) 

• -e-i<o * 

when $x > 0$. Again, every zero of $\tau(s)$ has its real part less than or
equal to $-d < -c$. Hence by Lemma 2 (iv) of Wright (1949 a), since
$s = -c + it$,

| t(s) I > JCI f I( 4 - 8 ) 
Using (4.6)-(4.8) in (4.2), we have

(HI 

Now 

r dt r jt 

J-. I f c*+t* 

and so, by Schwarz’s inequality, 

(JU 

< m 1 *+2 2 1 ",.w 1 ■)<*• 

By Parseval’s Theorem and (2.5), 

j" | H„fs) | ■*/-£' | *<"(*'-x) | *e*"dx < 
and, by (2.1) and (3.4), 

r I v{x) I * e**dx - f *"| 0(*, y) | * e u *dx 

J -« Jo Jo 

< Kt*^i?{K*e-«*)dx 


< AVI 


vw 


dfV-t*A\e), 


where $A(\epsilon) \to 0$ as $\epsilon \to 0$, by (2.2).



Hence, by (4.9), 

| *<*>(*)*« | < Nfi + KtfA{€). 

Wc now choose c small enough to make AV 4 («) < J and then S small 
enough to make Kfi < Jc. Hence, when x = (, 

I *<*>(£) I < ie*-* 

which contradicts (3.3). 


REFERENCES TO LITERATURE 

Bellman, R., 1947. "On the boundedness of solutions of non-linear differential and difference equations", Trans. Amer. Math. Soc., lxii, 357-386.

-, 1949. "On the existence and boundedness of solutions of non-linear differential-difference equations", Annals of Math., l, 347-355.

Cesari, L., 1939. "Sulla stabilità delle soluzioni delle equazioni differenziali lineari", Ann. Scuola norm. super. Pisa, viii, 131-148.

Levinson, N., 1946. "The asymptotic behaviour of a system of linear differential equations", Amer. Journ. Math., lxviii, 1-6.

Liapounoff, A., 1907. "Problème général de la stabilité du mouvement", Ann. Fac. Sci. Univ. Toulouse (2), ix, 203-475.

Perron, O., 1918. "Ein neuer Existenzbeweis fuer die Integrale eines Systems gewoehnlicher Differentialgleichungen", Math. Ann., lxxviii, 378-384.

-, 1928. "Ueber Stabilitaet und asymptotisches Verhalten der Loesungen eines Systems endlicher Differenzengleichungen", Journ. fuer Math., clxi, 41-64.

-, 1929. "Ueber Stabilitaet und asymptotisches Verhalten der Integrale von Differentialgleichungssystemen", Math. Zeits., xxix, 129-160.

Pitt, H. R., 1944. "On a class of integro-differential equations", Proc. Camb. Phil. Soc., xl, 199-211.

-, 1947. "On a class of integro-differential equations. II", Proc. Camb. Phil. Soc., xliii, 153-163.

Poincaré, H., 1892. Les méthodes nouvelles de la mécanique céleste, Paris.

Weyl, H., 1946. "Comment on the preceding paper", Amer. Journ. Math., lxviii, 7-12.

Widder, D. V., 1941. The Laplace Transform, Princeton, Chapter II.

Wright, E. M., 1946. "The non-linear difference-differential equation", Quart. Journ. Math., xvii, 245-252.

-, 1948. "The linear difference-differential equation with asymptotically constant coefficients", Amer. Journ. Math., lxx, 221-238.

-, 1949a. "The linear difference-differential equation with constant coefficients", Proc. Roy. Soc. Edin., A, lxii, 387-393.

-, 1949b. "Perturbed functional equations", Quart. Journ. Math., xx, 155-165.


(Issued separately May 17, 1950)



The Simplest Form of Second-Order Linear Differential Equation 27 


III. The Simplest Form of Second-Order Linear Differential
Equation, with Periodic Coefficient, having Finite Singularities.*
By Enzo Cambi. Communicated by Professor A. C. AITKEN, F.R.S.
(With Two Text-figures.)

(MS. received May 11, 1948. Revised MS. received April 4, 1949. 

Read January 10, 1949) 


Synopsis 

A differential equation of the second order, arising in problems of disturbed oscillation, 
such as occur in frequency modulation, is considered. The nature of its solutions is 
examined by the method of continued fractions. The cases in which the solutions are 
periodic, and the regions of stability and instability (lability), are determined according
to the values taken by the two parameters involved. 


I. Introduction 


1. When the essential parameters of an oscillating system of any nature
are periodic functions of the time, the performance of the system is
defined by a differential equation, which can be regarded as derived from
the harmonic equation

$$\frac{d^2y}{dx^2} + \omega^2 y = 0$$

by replacing the constant $\omega^2$ (depending on the values of the parameters
when they are supposed fixed) by a periodic function of the time-variable $x$.

Depending on the nature of this periodic function two general classes 
of equations can be defined, according as the function is an integral 
one, or has finite singularities in the complex x-plane. 

The distribution of the singularities of the coefficient of $y$ determines
that of the singularities of the solution of the equation. The simplest
cases of the above classes are respectively when the periodic coefficient
reduces to a binomial expression, of degree $1$ or $-1$, for instance

$$\frac{d^2y}{dx^2} + p^2(1 - 2\gamma\cos x)\,y = 0 \tag{1}$$

or

$$\frac{d^2y}{dx^2} + \frac{p^2\,y}{1 + 2\gamma\cos x} = 0. \tag{2}$$

* The author is greatly indebted to Dr J. C. P. Miller for having revised the English 
text and for having made many helpful suggestions and remarks. 






Equation (1) is the well-known Mathieu equation; equation (2)
arises in many problems of disturbed oscillations, such as in
frequency-modulation, where, in the non-dissipative case, it gives an exact
description of the performance of the system.

2. It is evident that equations (1) and (2) are equivalent, when $\gamma$ is
so small that its square is negligible.
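
The equivalence can be made explicit by expanding the coefficient of equation (2) in powers of the modulation parameter $\gamma$. The following is only a short check, assuming $|2\gamma\cos x| < 1$ for real $x$ so that the geometric series converges:

```latex
\frac{p^{2}}{1+2\gamma\cos x}
  = p^{2}\left(1 - 2\gamma\cos x + 4\gamma^{2}\cos^{2}x - \cdots\right)
  = p^{2}\,(1-2\gamma\cos x) + O(\gamma^{2}).
```

Dropping the terms in $\gamma^{2}$ leaves exactly the coefficient of the Mathieu equation (1).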

The Mathieu equation can be regarded as the simplest case of Hill's
equation, where the coefficient of $y$ may be expressed in the general form
of a Fourier series valid in a strip of the complex plane enclosing the
real axis. It is therefore of importance as an approximating equation,
providing approximate solutions of a Hill's equation.

The nature of equation (2) substantially depends on the magnitude
of the parameter $2\gamma$; when $|2\gamma| > 1$ the singularities of the equation
are real and the coefficient itself cannot be expanded in a Fourier series,
so that (2) is not even a Hill's equation in this case.

On the contrary, when $|2\gamma| < 1$ the singularities are complex, and
the coefficient is analytic in a strip of the $x$-plane, bounded by the straight
lines containing the points given by $\cos x = \pm 1/2\gamma$, where it can be
expanded in a Fourier series. In this case the equation could also be
solved by Hill's general method; but unless $|\gamma|$ is very small this method
leads to complicated expansions, difficult to interpret physically.

The main purpose of the present paper is to expound a method more
convenient than Hill's method, applicable in the case $2\gamma < 1$. In physical
applications this case is often the only significant case; for example in
frequency-modulation, when $2\gamma$ can be regarded as a relative modulation.

The convenience of the method set out below is so remarkable when
$\gamma$ is small that it is thought that equation (2) may conveniently replace
the Mathieu equation as an approximating form for Hill's equation.

The cases $|2\gamma| = 1$ and $|2\gamma| > 1$ are discussed separately, since, in
these cases, the equation has quite a different nature from that which
holds when $|2\gamma| < 1$. Owing to the comparative unimportance of these
cases in physical applications, they are treated below in less detail.


II. Solution of the Singular Equation (2) when $2\gamma < 1$


3. We write equation (2) in the form

$$(1 + 2\gamma\cos x)\,\frac{d^2y}{dx^2} + p^2 y = 0, \tag{2'}$$

where we regard $2\gamma$ and $p^2$ as real positive parameters, with $2\gamma < 1$. In
this and in later sections we regard $\gamma$ as positive, no loss of generality
being incurred thereby.




The coefficient of $d^2y/dx^2$ being periodic, Floquet's theory ensures
the existence of solutions that may be represented in the form of the
product of an exponential term by a periodic function, expressible as a
Fourier series valid in the strip above defined,

$$y = e^{iu_0x}\sum_{n=-\infty}^{\infty} B_n e^{inx}. \tag{3}$$

We prefer here the form $e^{iu_0x}$ to the more usual one. If $u_0$ is real, $e^{iu_0x}$ represents
a (central) trigonometric component of period $2\pi/u_0$.

In this expression $u_0$ is evidently indeterminate to the extent of an
additive integer. Further, since the left-hand side of equation (2') is
even in $x$, the expression for $y$ obtained by replacing $x$ by $-x$ is also
an integral of the equation, obtained formally from (3) by changing the
sign of $u_0$. Hence: Any equation giving the condition to be satisfied by
$u_0$ must at the same time be satisfied by any number of the form $\pm(u_0 \pm n)$.

By substituting for $y$, in (2'), a series of the form (3), where $u$ is
written for $u_0$, and by equating to zero the coefficient of every component
$\exp[i(u \pm n)x]$, we obtain an infinite system of homogeneous linear
equations satisfied by the $B_n$. The general equation of the system may
be written as

$$\gamma A_{n+1} + G(u+n)A_n + \gamma A_{n-1} = 0, \qquad n = \ldots, -2, -1, 0, 1, 2, \ldots, \tag{4}$$

where* the $A_n$ are proportional to $(u+n)^2B_n$ and

$$G(u) = 1 - \frac{p^2}{u^2}.$$
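
The substitution leading to (4) can be sketched as follows. This is an illustrative derivation only; it takes $A_n = (u+n)^2B_n$ as the normalisation, which is an assumption (any constant multiple would serve, since the system is homogeneous), and writes $2\cos x = e^{ix} + e^{-ix}$:

```latex
\begin{aligned}
y &= \sum_n B_n e^{i(u+n)x}, \qquad
\frac{d^{2}y}{dx^{2}} = -\sum_n (u+n)^{2} B_n e^{i(u+n)x},\\[4pt]
0 &= \left(1+\gamma e^{ix}+\gamma e^{-ix}\right)\frac{d^{2}y}{dx^{2}} + p^{2}y\\
  &= -\sum_n \Bigl[(u+n)^{2}B_n + \gamma (u+n-1)^{2}B_{n-1}
      + \gamma (u+n+1)^{2}B_{n+1} - p^{2}B_n\Bigr] e^{i(u+n)x},\\[4pt]
0 &= \gamma A_{n+1} + \Bigl(1-\frac{p^{2}}{(u+n)^{2}}\Bigr)A_n + \gamma A_{n-1}
   \qquad\text{with } A_n=(u+n)^{2}B_n .
\end{aligned}
```

The last line is equation (4) with $G(u+n) = 1 - p^{2}/(u+n)^{2}$.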

In considering the above general equation it is logical to regard $A_n$
as a function of $u+n$, let us say $A_n = V(u+n)$; we then find that all
equations (4) for which the suffixes $n$ are all positive (we shall call these
positive equations) are satisfied for any value of $u$, provided that the
function $V$ satisfies the linear difference equation† of second order

$$\gamma V(u+1) + G(u)V(u) + \gamma V(u-1) = 0. \tag{5}$$

Similarly the negative equations (4), $n < 0$, are satisfied if $A_{-n}$ is
replaced by $V(-u+n)$, where the function $V$ satisfies the same difference
equation as above; this follows since $G(u)$ is an even function* of $u$.

* If $u$ is such that the series $e^{iux}\sum B_n e^{inx}$ is an integral of (2'), then the series
$e^{iux}\sum A_n e^{inx}$ is proportional to $d^2y/dx^2$, that is, to $y/(1 + 2\gamma\cos x)$.
† For information concerning difference equations, the treatise by N. E. Nörlund,
Vorlesungen über Differenzenrechnung, Berlin, 1924, may be consulted.

All the integrals of a difference equation of the second order, such
as (5), are known as soon as two linearly independent integrals constituting
a fundamental system are known. If we divide (5) by $V(u)$ and solve
by recurrence for $V(u)/V(u+1)$ or for $V(u)/V(u-1)$, we obtain two
independent expressions for these ratios, given in the form of continued
fractions as follows:

$$\frac{V(u)}{V(u+1)} = \cfrac{-\gamma}{G(u) - \cfrac{\gamma^2}{G(u-1) - \cfrac{\gamma^2}{G(u-2) - \cdots}}}, \qquad
\frac{V(u)}{V(u-1)} = \cfrac{-\gamma}{G(u) - \cfrac{\gamma^2}{G(u+1) - \cfrac{\gamma^2}{G(u+2) - \cdots}}}.$$

Since $G(u)$ is even, we can express both ratios in terms of a single
function of $u$ defined by the continued fraction

$$v(u) = \cfrac{1}{G(u) - \cfrac{\gamma^2}{G(u+1) - \cfrac{\gamma^2}{G(u+2) - \cdots}}},$$

which converges for all values of $u$ and $p$, provided that $2\gamma < 1$. Two
different and independent expressions for the ratio of two consecutive
values of the function $V$, both satisfying the difference equation, are then

$$\frac{V(u)}{V(u+1)} = -\gamma v(-u), \qquad \frac{V(u)}{V(u-1)} = -\gamma v(u). \tag{6}$$

Assuming an arbitrary initial determination of $V(u)$ in a unit interval
of real values of $u$, we may, by repeated multiplication by one or other
of the above expressions for the ratio, define two independent solutions
of the difference equation throughout the whole real $u$-axis.

The resulting functions are not as a rule analytic, but may, for example, have
discontinuities at all points which are congruent with the limits of the initial interval.
Analytic solutions of (5) may actually be obtained, but their determination is unnecessary
for our purpose, since only the values at congruent arguments are required.
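
A minimal numerical sketch of the continued fraction $v(u)$ may help here. It is not taken from the paper: the function names, the truncation depth and the sample values $p = 20/3$, $\gamma = 0.1$, $u = 7.25$ are assumptions chosen for illustration. The fraction is evaluated from the bottom upwards, the neglected tail being replaced by the value that $v$ approaches for large argument (the smaller root of $\gamma^2t^2 - t + 1 = 0$), and the result is checked against the identity $1/v(u) = G(u) - \gamma^2v(u+1)$ that follows directly from the definition. The sketch assumes $0 < 2\gamma < 1$ and that no argument $u + k$ is exactly zero.

```python
import math

def G(u, p):
    """G(u) = 1 - p^2/u^2, the diagonal coefficient of system (4)."""
    return 1.0 - (p / u) ** 2

def v(u, p, gamma, depth=200):
    """Evaluate v(u) = 1/(G(u) - gamma^2/(G(u+1) - gamma^2/(G(u+2) - ...))).

    The fraction is cut after `depth` levels (an assumed cut-off) and the
    tail is replaced by the limiting value of v for large argument.
    """
    # Smaller root of gamma^2 t^2 - t + 1 = 0 (limit of v as u grows).
    t = (1.0 - math.sqrt(1.0 - 4.0 * gamma ** 2)) / (2.0 * gamma ** 2)
    # Work upwards from the deepest level to the top of the fraction.
    for k in range(depth, -1, -1):
        t = 1.0 / (G(u + k, p) - gamma ** 2 * t)
    return t

p, gamma = 20.0 / 3.0, 0.1          # illustrative parameter values
u = 7.25                            # arbitrary non-integral argument
lhs = 1.0 / v(u, p, gamma)
rhs = G(u, p) - gamma ** 2 * v(u + 1.0, p, gamma)
print(abs(lhs - rhs))               # defect of the defining recurrence
```

Because each level of the fraction contracts the error of the tail by a factor of the order of $\gamma^2v^2$, a modest depth is already ample when $\gamma$ is small.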

* If $A_0$ is regarded as belonging to the positive system, for $n = 0$, it is expressed as
$V(u)$; if it is regarded as a term of the negative system it is given by $V(-u)$. The
discrepancy is only apparent, however, since the function $V$ is undetermined to the extent
of an arbitrary factor, and $A_0$ itself is an arbitrary constant of the differential equation.

4. The continued fraction $-\gamma v(u)$ tends to the value

$$-\frac{1 - \sqrt{1 - 4\gamma^2}}{2\gamma},$$

the modulus of which is smaller than unity* (if $2\gamma < 1$) when $u$ tends
to infinity through positive values.

Assuming arbitrarily $V(u) = A_0$, and evaluating $V(u+1)$, $V(u+2)$, $\ldots$
by means of the second of the ratios (6), thus

$$A_1 = V(u+1) = -\gamma v(u+1)A_0; \qquad A_2 = -\gamma v(u+2)A_1 = \gamma^2 v(u+2)v(u+1)A_0,$$

and so on, we obtain a succession of numbers $A_n$, converging to zero
as $n$ tends to infinity, and satisfying the positive set of equations (4).
Similarly, assuming $V(-u) = A_0$, and evaluating

$$A_{-1} = V(-u+1) = -\gamma v(-u+1)A_0; \qquad A_{-2} = V(-u+2) = \gamma^2 v(-u+2)v(-u+1)A_0,$$

and so on, we obtain a set of values satisfying the negative equations of
the system, and also converging to zero as $n$ tends to infinity.

The equations of the linear system belonging to values of $n$ respectively
positive or negative are thus satisfied independently by the above
expressions, for any value of $u$. The possible values which can actually
be assumed by $u$ are then defined by the central equation, for $n = 0$, which
has not yet been considered, and which we may write as

$$\gamma\frac{A_1}{A_0} + G(u) + \gamma\frac{A_{-1}}{A_0} = 0,$$

that is,

$$-\gamma^2 v(-u+1) + G(u) - \gamma^2 v(u+1) = 0. \tag{7}$$
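
The central equation (7) lends itself to a direct numerical solution. The sketch below is an illustration rather than the author's procedure: it evaluates the left side of (7) with a truncated continued fraction and locates the root near $u = p$ by plain bisection. The function names, the truncation depth, the bracket $[p,\ p + 0.1]$ and the sample values $p = 20/3$, $\gamma = 0.1$ are all assumptions; the bracket is assumed to contain exactly one sign change and no pole of the left side.

```python
import math

def G(u, p):
    return 1.0 - (p / u) ** 2

def v(u, p, gamma, depth=200):
    # Truncated continued fraction, tail replaced by its large-u limit.
    t = (1.0 - math.sqrt(1.0 - 4.0 * gamma ** 2)) / (2.0 * gamma ** 2)
    for k in range(depth, -1, -1):
        t = 1.0 / (G(u + k, p) - gamma ** 2 * t)
    return t

def resonance(u, p, gamma):
    """Left side of (7): -g^2 v(1-u) + G(u) - g^2 v(1+u)."""
    return (-gamma ** 2 * v(1.0 - u, p, gamma) + G(u, p)
            - gamma ** 2 * v(1.0 + u, p, gamma))

def central_root(p, gamma, lo, hi, steps=80):
    """Bisection on [lo, hi], assumed to bracket a single sign change."""
    f_lo = resonance(lo, p, gamma)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        f_mid = resonance(mid, p, gamma)
        if (f_mid < 0.0) == (f_lo < 0.0):
            lo, f_lo = mid, f_mid      # root lies in the upper half
        else:
            hi = mid                   # root lies in the lower half
    return 0.5 * (lo + hi)

p, gamma = 20.0 / 3.0, 0.1             # illustrative parameter values
u0 = central_root(p, gamma, p, p + 0.1)
print(u0, resonance(u0, p, gamma))
```

Bisection is used only because it needs no derivative and cannot leave the bracket; any other one-dimensional root finder would do equally well.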


5. Equation (7) replaces Hill's determinantal equation, to which it
may be shown to be equivalent. We shall refer to it below as the
"resonance equation", since, if $u_0$ is real, it gives the natural frequencies
of the oscillating system. If $u_0$ is a root of (7) and the terms $A$ are
computed by the above recurrences, with the assumption that $A_0 = 1$,
the series

$$e^{iu_0x}\sum_{n=-\infty}^{\infty} A_n e^{inx}$$

represents the second derivative $d^2y/dx^2$ of an integral of (2'); that is,
in virtue of the equation itself, an integral of (2') divided by $1 + 2\gamma\cos x$.
A series representing the integral itself may be written as

$$e^{iu_0x}\sum_{n=-\infty}^{\infty} B_n e^{inx},$$

where $B_n = u_0^2A_n/(u_0+n)^2$, so that we have also $B_0 = 1$.

* If $2\gamma > 1$ the continued fractions diverge in any case, and the difference equation
cannot be solved by means of continued fractions (Nörlund, 1924, p. 441).





III. Discussion of the Solution 

6. In virtue of its definition, the function $v(u)$ satisfies the recurrence
relation

$$\gamma^2 v(u+1) + \frac{1}{v(u)} = G(u), \tag{8}$$

from which, by recurrence, any value of the function may be determined
as soon as it is known in a given unit interval of values of the argument.

In particular, the functions $v$ appearing in the general coefficient

$$A_{\pm n} = (-\gamma)^n\, v(\pm u_0 + 1)\, v(\pm u_0 + 2) \cdots v(\pm u_0 + n),$$

can be deduced from the initial values $v(1 \pm u_0)$, the computation of the
continued fraction being required only for these two values.

In view of the recurrence formulae (8), the resonance equation (7) can
be written

$$\frac{1}{\gamma v(u)} - \gamma v(-u+1) = 0 \quad\text{or}\quad \frac{1}{\gamma v(-u)} - \gamma v(u+1) = 0. \tag{7'}$$

It is easily seen, again by (8), and by the fact that $G(u)$ is even, that if
$u_0$ is a root, then any number of the form $\pm(u_0 \pm n)$, where $n$ is an integer,
also satisfies the equation, in conformity with the general conclusion
regarding the exponential term of Floquet's solution.

The roots of (7'), however, cannot all be determined with equal ease,
since the roots that are relatively distant from $p$ and $-p$ respectively
are extremely difficult to locate, and may also prove refractory to numerical
tabulation, even if the tabular interval is very small. The "lateral"
roots owe their existence, in fact, to the existence of odd poles of the
expressions on the left side of (7'), the sharpness of which increases
rapidly (fig. 1) with increasing distance from $|u| = |p|$.

The root $u_0$, which is close to $u = p$ and tends* to $p$ when $\gamma \to 0$, is
on the contrary determined very easily, since the left side of the resonance
equation, written in the first of the forms (7'), varies almost linearly
with $u$ in the vicinity of $u = p$; all the other roots are then automatically
known also. The determination of the "characteristic exponents", which
is the crucial point in the resolution of any differential equation with
periodic coefficients, is thus formulated and solved, in the case of the
present equation, with exceptional simplicity.

* In the case $\gamma = 0$, the resonance equation is reduced to $G(u) = 0$, that is, $u^2 = p^2$.
The differential equation is then simply a harmonic equation the general integral of
which is $y = C_1e^{ipx} + C_2e^{-ipx}$.




All the roots of (7') that differ by integers from one another give rise
to the same set of values of the amplitudes $A_n$ (in relative magnitude),
that is, to the same integral of the differential equation. Thus, as is to
be expected, only two independent integrals of the equation itself can be
obtained, depending on the form ($n + u_0$ or $n - u_0$) of the chosen value
of $u$, where $u_0$ is the "central" positive root, namely the one tending to $p$
when $\gamma \to 0$.


Fig. 1. Behaviour of the expression $1/[\gamma v(u)] - \gamma v(-u+1)$ around $u = p$. [$1/p = 0.15$; $\gamma = 0.1$]

7. The ratio of two consecutive coefficients, $A_{n+1}/A_n$, or $A_{-n-1}/A_{-n}$,
of the Fourier series $\sum A_n e^{inx}$, that is, the number $-\gamma v(n + 1 \pm u_0)$, tends
to $-(1 - \sqrt{1 - 4\gamma^2})/2\gamma$ as $n \to \infty$. This is smaller than unity in absolute
value, since $2\gamma < 1$, and the series converges absolutely for all values of $x$
such that

$$|e^{\pm ix}| < |2\gamma/(1 - \sqrt{1 - 4\gamma^2})| = |(1 + \sqrt{1 - 4\gamma^2})/2\gamma|.$$

These values of $x$ are contained in the strip of the complex plane
limited by the straight lines

$$|\Im(x)| = |\log[(1 + \sqrt{1 - 4\gamma^2})/2\gamma]|,$$

which contain the singularities given by $2\gamma\cos x = -1$.
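
The statement that the boundary lines of the strip pass through the singularities can be checked numerically. The short sketch below is illustrative only: the function name and the sample value $\gamma = 0.1$ are assumptions, and $0 < 2\gamma < 1$ is required. It computes the half-width of the strip, confirms that $1 + 2\gamma\cos x$ vanishes at the point $x = \pi + i\eta$ on the upper boundary, and confirms that the limiting coefficient ratio is the reciprocal of $e^{\eta}$.

```python
import cmath
import math

def strip_half_width(gamma):
    """Half-width eta of the strip |Im x| < eta, assuming 0 < 2*gamma < 1."""
    root = math.sqrt(1.0 - 4.0 * gamma ** 2)
    return math.log((1.0 + root) / (2.0 * gamma))

gamma = 0.1                                   # illustrative value
eta = strip_half_width(gamma)

# A point on the upper boundary line with real part pi.
x_sing = complex(math.pi, eta)
residual = 1.0 + 2.0 * gamma * cmath.cos(x_sing)

# Modulus of the limiting ratio of consecutive coefficients.
limit_ratio = (1.0 - math.sqrt(1.0 - 4.0 * gamma ** 2)) / (2.0 * gamma)

print(eta, abs(residual), limit_ratio * math.exp(eta))
```

The product in the last line equals unity because the two expressions are the roots of $\gamma t^2 - t + \gamma = 0$, whose product is one.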

P.R.S.E., VOL. LXIII, A, 1949-50, PART I


IV. Approximate Expressions, valid in the Case of Small y 

8. The first step of the solution is the determination of one root of
the resonance equation. The determination from equation (7) of the exact
value of the root $u_0$ lying near $p$ is not difficult, unless $2\gamma$ is extremely
close to unity. An approximate expression, valid when $\gamma$ is small, on
which further approximations may be based, is easily obtained by
replacing the continued fraction $v(u)$ by the first convergent $1/G(u)$; this
can be done with a relative error of the order of $\gamma^2$. It must be supposed
that $p \neq \tfrac{1}{2}$. The left side of (7), when $u_0 = p$, has the value

$$-\gamma^2 v(p+1) - \gamma^2 v(-p+1) \approx -2\gamma^2(3p^2-1)/(4p^2-1),$$

and its first derivative, to the same order of accuracy, is $2/p$, so that an
approximate value of the root is given by

$$u_0 = p\,\{1 + \gamma^2(3p^2-1)/(4p^2-1)\}$$

with an error of the order of $\gamma^4$. Correspondingly,

$$A_{\pm n}/A_{\pm(n-1)} = -\gamma v(n \pm u_0) \approx -\gamma(n \pm u_0)^2/\{(n \pm u_0)^2 - p^2\}; \tag{9}$$

the value of $u_0$ to be introduced in this expression can be deduced, for
instance, from the above approximate formula.
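
The approximate root and the ratio (9) are simple enough to evaluate directly. The following sketch is only an illustration; the function names are hypothetical, and the values $p = 20/3$, $\gamma = 0.1$ and the accurate root $6.7176953531$ are those of the numerical example given in § 9 of the text.

```python
def u0_approx(p, gamma):
    """Approximate central root p{1 + g^2 (3p^2 - 1)/(4p^2 - 1)}."""
    return p * (1.0 + gamma ** 2 * (3.0 * p ** 2 - 1.0) / (4.0 * p ** 2 - 1.0))

def ratio_9(n, u0, p, gamma, sign=+1):
    """Ratio A_{+-n}/A_{+-(n-1)} from (9), i.e. v replaced by 1/G.

    sign = +1 gives the coefficients of positive index, sign = -1 those
    of negative index.
    """
    w = n + sign * u0
    return -gamma * w ** 2 / (w ** 2 - p ** 2)

p, gamma = 20.0 / 3.0, 0.1
u0_first = u0_approx(p, gamma)                         # first approximation
r_plus = ratio_9(1, 6.7176953531, p, gamma, +1)        # A_1 / A_0
r_minus = ratio_9(1, 6.7176953531, p, gamma, -1)       # A_{-1} / A_0
print(u0_first, r_plus, r_minus)
```

The formula breaks down when $4p^2 = 1$ or when $(n \pm u_0)^2 = p^2$, where the first convergent $1/G$ is no longer a usable approximation.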

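If the difference $u_0 - p$ may be neglected, we can assume for the ratio
the simple expression

$$A_{\pm n}/A_{\pm(n-1)} = -\gamma(n \pm p)^2/n(n \pm 2p). \tag{10}$$

The corresponding expressions for the coefficients $A$, assuming $A_0 = 1$,
are respectively

$$A_{\pm n} = (-\gamma)^n \left[\frac{(n \pm u_0)!}{(\pm u_0)!}\right]^2 \frac{(p \pm u_0)!\,(-p \pm u_0)!}{(n + p \pm u_0)!\,(n - p \pm u_0)!},$$

$$A_{\pm n} = (-\gamma)^n \left[\frac{(n \pm p)!}{(\pm p)!}\right]^2 \frac{(\pm 2p)!}{n!\,(n \pm 2p)!},$$

with the usual extension of the meaning of the symbol $x!$ to non-integral
values of the argument.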
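
For computation it is usually easier to accumulate the ratios (10) than to evaluate the factorial expressions. The sketch below does this; the function name is hypothetical and the values $p = 20/3$, $\gamma = 0.1$ are those of the numerical example of § 9. It assumes that $2p$ is not an integer, so that the denominator $n \pm 2p$ never vanishes.

```python
def coefficients_10(p, gamma, nmax, sign=+1):
    """Return [A_0, A_{+-1}, ..., A_{+-nmax}] from the ratio (10), A_0 = 1.

    sign = +1 gives A_n, sign = -1 gives A_{-n}.
    """
    A = [1.0]
    for n in range(1, nmax + 1):
        ratio = -gamma * (n + sign * p) ** 2 / (n * (n + sign * 2.0 * p))
        A.append(A[-1] * ratio)
    return A

p, gamma = 20.0 / 3.0, 0.1
A_pos = coefficients_10(p, gamma, 4, +1)    # A_0 .. A_4
A_neg = coefficients_10(p, gamma, 4, -1)    # A_0, A_-1 .. A_-4
for n in range(5):
    print(n, A_pos[n], A_neg[n])
```

The first few values agree with Column III of the comparison table in § 9, for instance $A_1 \approx -0.41008$ and $A_{-1} \approx 0.26036$.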


V. A Numerical Example 

9. The performance of the simplest frequency-modulating circuit, that
is, of a variable-capacitance (or variable-inductance) resonant circuit, is
given, in the non-dissipative case, by equation (2), where $p$ is the ratio
of the (static) resonance frequency to the modulating frequency, and $2\gamma$
is the relative modulation of the variable element. In the present example
we assume $p = 20/3$, $2\gamma = 0.2$. (Such values, we may observe, are unusual
in the technical applications, where $p$ is usually very large and $\gamma$ very
small.)

The approximate formula of § 8 then gives for the central resonance
the value $u_0 = 6.71657$, to an accuracy of the order of $\gamma^4 = 0.0001$.

If we compute the expression on the left of the resonance equation

$$P(u_0) \equiv -\gamma^2 v(1 - u_0) + G(u_0) - \gamma^2 v(1 + u_0) = 0$$

for this value of $u_0$, we obtain $-0.00041\,82359$. Since this is negative,
and since the slope of the corresponding curve is positive and of the
order $2/p = 0.3$ but slightly greater than $0.3$, the actual root is larger than
the above approximation, but smaller than that which could be found
by assuming a slope of $0.3$ for the curve. We thus locate the root between
$6.71657$ and, say, $6.718$.

We find, in fact, that $P(6.718) = +0.00011\,31942$. Linear interpolation
then gives an improved value of $u_0$, namely $u_0' = 6.71769\,54111$.
Similarly, $P(u_0') = +0.00000\,00215\,38$, and a further linear interpolation,
using this and the former $P(6.718)$, gives $u_0'' = 6.71769\,53531\,34$, a value
which satisfies the resonance equation to within $10^{-12}$.

The values of the $v$-functions used in subsequent computations are

$$-\gamma^2 v(1 + u_0'') = -0.04375\,41634\,43,$$

$$-\gamma^2 v(1 - u_0'') = 0.02861\,95457\,55,$$

and

$$G(u_0'') = 0.01513\,46176\,88.$$

The terms $v(n \pm u_0)$ appearing in the expressions for the amplitudes
$A_n$ can now be deduced by means of (8).

Remark. When computing with the recurrence formula (8), which gives $v(u+1)$
in terms of $v(u)$, a loss of accuracy is unavoidable; and the same happens in deducing
$B_n$ from $A_n$ (§§ 3, 5) when $n$ is negative. In order to give all the coefficients to the same
degree of accuracy, obvious artifices are used, such as (i) direct computation of the
highest $v(n \pm u_0)$ by the continued fraction, so as to apply the recurrence backwards;
(ii) independent computation of $B_{-n}$ (from an obvious recurrent expression) and deduction
of $A_{-n}$ from it.

Since $B_0 = A_0 = 1$, if we assume the ($B$) series for $y$, the derivative
$d^2y/dx^2$ will be given by the ($A$) series multiplied by $-u_0^2$; the differential
equation then becomes

$$-(1 + \gamma e^{ix} + \gamma e^{-ix})\,\frac{u_0^2}{p^2}\sum A_n e^{i(n+u_0)x} + \sum B_n e^{i(n+u_0)x} = 0;$$

and this may be used as a numerical check.
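
The numerical check just described can be reproduced in a few lines. The sketch below is an illustration, not the author's own computing scheme: each $v(n \pm u_0)$ is obtained directly from a truncated continued fraction instead of from the recurrence (8), which sidesteps the loss of accuracy mentioned in the Remark. The function names and the truncation depth are assumptions; $p$, $\gamma$ and $u_0$ are the values quoted in the text.

```python
import math

P, GAMMA = 20.0 / 3.0, 0.1
U0 = 6.7176953531              # central root quoted in the text

def G(u):
    return 1.0 - (P / u) ** 2

def v(u, depth=200):
    # Truncated continued fraction, tail replaced by its large-u limit.
    t = (1.0 - math.sqrt(1.0 - 4.0 * GAMMA ** 2)) / (2.0 * GAMMA ** 2)
    for k in range(depth, -1, -1):
        t = 1.0 / (G(u + k) - GAMMA ** 2 * t)
    return t

def series_coefficients(nmax):
    """A_n and B_n for |n| <= nmax, normalised so that A_0 = B_0 = 1."""
    A = {0: 1.0}
    for n in range(1, nmax + 1):
        A[n] = -GAMMA * v(n + U0) * A[n - 1]
        A[-n] = -GAMMA * v(n - U0) * A[-n + 1]
    B = {n: U0 ** 2 * a / (U0 + n) ** 2 for n, a in A.items()}
    return A, B

def residuals(A, B, nmax):
    """Coefficient of exp[i(n + u0)x] in the check equation, |n| < nmax."""
    scale = U0 ** 2 / P ** 2
    return {n: B[n] - scale * (A[n] + GAMMA * (A[n - 1] + A[n + 1]))
            for n in range(-nmax + 1, nmax)}

A, B = series_coefficients(12)
res = residuals(A, B, 12)
worst = max(abs(r) for r in res.values())
print(A[1], A[-1], B[1], B[-1], worst)
```

Each residual corresponds to one entry of the "Sum" column of the table of coefficients, expressed here as a floating-point number instead of in units of the tenth decimal place.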

We give the values of the coefficients, as deduced from the above
formulae, as follows:

| $n$ | $B_n$ | $-\dfrac{u_0^2}{p^2}A_n$ | $-\dfrac{u_0^2}{p^2}\gamma A_{n-1}$ | $-\dfrac{u_0^2}{p^2}\gamma A_{n+1}$ | Sum |
|---|---|---|---|---|---|
| -6 | 190 | -2 | 0 | -187 | +0* |
| -5 | 28242 | -1875 | 0 | -26367 | 0 |
| -4 | 15 86654 | -2 63674 | -187 | -13 22794 | -1* |
| -3 | 425 36519 | -132 27936 | -26367 | -292 82216 | 0 |
| -2 | 5847 38436 | -2928 22163 | -13 22794 | -2905 93479 | -0* |
| -1 | +0.39505 84199 | -0.29059 34788 | -292 82216 | -0.10153 67194 | +1 |
| 0 | +1.00000 00000 | -1.01536 71943 | -2905 93479 | 4442 65422 | 0 |
| 1 | -0.33150 09151 | +0.44426 54217 | -0.10153 67194 | -1122 77872 | 0 |
| 2 | 6566 11189 | -0.11227 78722 | 4442 65422 | 219 02111 | 0 |
| 3 | -1030 80589 | 2190 21108 | -1122 77872 | -36 62646 | +1 |
| 4 | 141 71297 | -366 26460 | 219 02111 | 5 53052 | 0 |
| 5 | -17 90191 | 55 30524 | -36 62646 | -77687 | 0 |
| 6 | 2 13477 | -7 76874 | 5 53052 | 10344 | -1 |
| 7 | -24431 | 1 03440 | -77687 | -1322 | +0* |
| 8 | 2712 | -13220 | 10344 | 164 | +0* |
| 9 | -294 | 1636 | -1322 | -20 | 0 |
| 10 | 31 | -197 | 164 | 2 | 0 |
| 11 | -3 | 23 | -20 | 0 | 0 |
| 12 | 0 | -3 | 2 | 0 | -1 |

Decimal points and ciphers are omitted except in the largest entries. The entries in
the "Sum" column are in the tenth decimal place; an asterisk indicates that a 5,
enforced, follows in the eleventh place.

A comparison between the accurate values and those which can be
obtained by using the approximate asymptotic formulae (9) and (10) may
be useful; the comparison is shown in the following table, where the
accurate values of $A_n$ (Column I)* are compared with those of the
formula (9) (taken with the true value of $u_0$) and of the formula (10)
(Columns II and III):

| $n$ | I | II | III |
|---|---|---|---|
| -6 | 2 | 2 | 1 |
| -5 | 1847 | 1785 | 1378 |
| -4 | 2 59683 | 2 50998 | 2 06644 |
| -3 | 130 27736 | 125 93840 | 108 48806 |
| -2 | 2883 90412 | 2790 36452 | 2501 50150 |
| -1 | 0.28619 54576 | 0.27817 32090 | 0.26036 03604 |
| 0 | 1.00000 00000 | 1.00000 00000 | 1.00000 00000 |
| 1 | -0.43754 16344 | -0.39397 62919 | -0.41007 75194 |
| 2 | 0.11057 85895 | 9489 03916 | 0.10043 92765 |
| 3 | -2157 06307 | -1792 55699 | -1915 40061 |
| 4 | 360 72133 | 292 38273 | 314 32314 |
| 5 | -54 46822 | -43 23224 | -46 67222 |
| 6 | 7 65116 | 5 96134 | 6 45543 |
| 7 | -1 01875 | -78047 | -84712 |
| 8 | 13020 | 9819 | 10677 |
| 9 | -1611 | -1197 | -1304 |
| 10 | 194 | 142 | 155 |
| 11 | -23 | -16 | -18 |
| 12 | 3 | 2 | 2 |

* In many physical problems the coefficients $A_n$ are of greater importance than
the $B_n$, which in any case are readily deduced from them (§ 16).




If we note that the assumed value for $\gamma$ (namely $0.1$) can actually be
regarded as very high for most physical applications, we readily conclude
that the approximate formulae following (9) and (10) (the latter being
extremely simple for computation) are very convenient for a preliminary
survey of the approximate distribution of the components.

VI. Stability of the Solutions 

10. Any differential equation with periodic coefficients always has
regions of instability (lability); that is, values of the parameters exist,
for which one or both of the independent integrals increase or decrease
indefinitely in amplitude as the independent variable increases. It is
obvious that, so long as $u_0$ is real, the absolute value of an expression
such as $\exp(\pm iu_0x)\sum B_n\exp(\pm inx)$ remains unchanged when $x$ increases
by $2\pi$. If $u_0$ is complex, on the other hand, each of the two integrals is
then multiplied by a constant, one less and one greater than unity, when $x$
increases by $2\pi$. More precisely, Floquet's theory states that two independent
integrals of the second-order equation may be obtained, each
satisfying a relation of the form $F(x + 2\pi) = \rho F(x)$, where $\rho$ is a complex
constant; and that the two possible values of $\rho$ are given by a quadratic
equation, depending only on the constants of the original equation.

In the present case, the two possible values of $\rho$ are known in the
form $\exp(\pm 2\pi iu_0)$, so that the quadratic equation is written at once as

$$\rho^2 - 2\rho\cos 2\pi u_0 + 1 = 0,$$

having the discriminant $\Delta$ given by $\tfrac14\Delta = -\sin^2 2\pi u_0$.
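
The behaviour of the two multipliers can be illustrated numerically. In the sketch below the function name is hypothetical, the real exponent is the central root of the numerical example of § 9, and the complex exponent $0.5 + 0.05i$ is an arbitrary value chosen only to show an unstable case.

```python
import cmath
import math

def floquet_multipliers(u0):
    """Roots of rho^2 - 2 rho cos(2 pi u0) + 1 = 0 for real or complex u0."""
    c = cmath.cos(2.0 * math.pi * u0)
    root = cmath.sqrt(c * c - 1.0)       # square root of one quarter of Delta
    return c + root, c - root

# Real exponent: both multipliers lie on the unit circle (stable case).
rho1, rho2 = floquet_multipliers(6.7176953531)

# Complex exponent (illustrative): one modulus below and one above unity.
rho3, rho4 = floquet_multipliers(0.5 + 0.05j)

print(abs(rho1), abs(rho2), abs(rho3), abs(rho4))
```

In both cases the product of the two multipliers is unity, as the constant term of the quadratic requires.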

So long as $u_0$ is real, we know that $\rho$ is complex with modulus unity,
and the solutions are both stable; $\rho$ can pass to values having different
modulus only at points which make the discriminant vanish, that is,
values which make $u_0 = n$ or $n + \tfrac12$, where $n$ is any integer.

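11. In the $(p, \gamma)$ plane, therefore, only the curves given by pairs of
values of $p$ and $\gamma$ making $u_0$ an integer or the half of an odd integer can
be boundaries of the regions of lability, where these exist. The equations
of these curves are obtained at once by noting that the resonance equation
must be satisfied by $u = n$ or $u = n + \tfrac12$ respectively; that is, owing to the
congruence of the roots, by $u = 0$ or $u = \tfrac12$ respectively.

The conditions are obviously $1/v(1) = 0$ or $\gamma^2v^2(\tfrac12) = 1$; that is,

$$1 - p^2 - \cfrac{\gamma^2}{1 - \dfrac{p^2}{4} - \cfrac{\gamma^2}{1 - \dfrac{p^2}{9} - \cdots}} = 0, \tag{11}$$

$$1 \pm \gamma - 4p^2 - \cfrac{\gamma^2}{1 - \dfrac{4p^2}{9} - \cfrac{\gamma^2}{1 - \dfrac{4p^2}{25} - \cdots}} = 0. \tag{12}$$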


In the case of equation (12), two families of curves exist which make
$u_0 = n + \tfrac12$, according to the sign preceding the term $\gamma$. The two curves
corresponding to the same value of $u_0$ bound, in fact, a region of lability
in the $(p, \gamma)$ plane; they may be found numerically by solving for $p^2$, by
successive approximations with given values of $\gamma$.

Equation (11), on the contrary, gives a single family of curves, that is,
one curve for each integral value of $u_0$; it will be proved later that there
are no regions of lability bounded by curves $u_0 =$ integer. Since $u_0$ is
equal to $p$ at $\gamma = 0$, the curves of families (11) and (12) obviously start
at the points $p = n$ or $p = n + \tfrac12$ of the $p$-axis.

12. Equations (11) and (12) can be solved for $p^2$ with relative ease, even
for values of $\gamma$ very close to $\tfrac12$. When $\gamma$ is large, of course, the
convergence of the procedure is rather slow, since the $(p, \gamma)$ curves become
very flat near $\gamma = \tfrac12$, where they have a horizontal tangent.* In the
numerical case examined above we get the following values for some
points of the boundary curves.


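Values of $p$ (for given $\gamma$) making $u_0 = \tfrac12,\ 1,\ \tfrac32$

| $\gamma$ | $u_0 = \tfrac12$ | $u_0 = \tfrac12$ | $u_0 = 1$ | $u_0 = \tfrac32$ | $u_0 = \tfrac32$ |
|---|---|---|---|---|---|
| 0.0 | 0.50000 | 0.50000 | 1.00000 | 1.50000 | 1.50000 |
| 0.1 | 0.47137 | 0.52165 | 0.99326 | 1.48905 | 1.48907 |
| 0.2 | 0.43448 | 0.53659 | 0.97170 | 1.45409 | 1.45428 |
| 0.3 | 0.38463 | 0.54360 | 0.93048 | 1.38673 | 1.38763 |
| 0.4 | 0.31159 | 0.53771 | 0.85374 | 1.26028 | 1.26325 |
| 0.45 | 0.25207 | 0.52348 | 0.78327 | 1.14183 | 1.14789 |
| 0.5 | 0.00000 | 0.35355 | 0.35355 | (0.35355) | (0.35355) |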
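
Points of the boundary curves can be recomputed by solving the conditions $1/v(1) = 0$ and $\gamma v(\tfrac12) = \pm 1$ for $p$ at a given $\gamma$. The sketch below is not the author's method of successive approximations: it uses plain bisection, and the function names, the truncation depth and the brackets are assumptions. Each bracket is assumed to contain a single sign change, and $0 < \gamma \le \tfrac12$ is required.

```python
import math

def v(u, p, gamma, depth=400):
    # Truncated continued fraction v(u), tail replaced by its large-u limit.
    t = (1.0 - math.sqrt(max(0.0, 1.0 - 4.0 * gamma ** 2))) / (2.0 * gamma ** 2)
    for k in range(depth, -1, -1):
        t = 1.0 / (1.0 - (p / (u + k)) ** 2 - gamma ** 2 * t)
    return t

def f_integer(p, gamma):
    """1/v(1) = G(1) - gamma^2 v(2); zero on the curves u0 = integer."""
    return 1.0 - p ** 2 - gamma ** 2 * v(2.0, p, gamma)

def f_half(p, gamma, sign):
    """1/v(1/2) + sign*gamma; zero on the curves u0 = n + 1/2 (sign = +-1)."""
    return 1.0 + sign * gamma - 4.0 * p ** 2 - gamma ** 2 * v(1.5, p, gamma)

def bisect(f, lo, hi, steps=60):
    """Bisection on [lo, hi], assumed to bracket one sign change of f."""
    f_lo = f(lo)
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if (f(mid) < 0.0) == (f_lo < 0.0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

p_half_lower = bisect(lambda p: f_half(p, 0.1, -1), 0.4, 0.5)
p_half_upper = bisect(lambda p: f_half(p, 0.1, +1), 0.5, 0.6)
p_integer = bisect(lambda p: f_integer(p, 0.2), 0.9, 1.0)
print(p_half_lower, p_half_upper, p_integer)
```

The truncation depth is deliberately generous, because the continued fraction converges more and more slowly as $\gamma$ approaches $\tfrac12$.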


It is seen at once that the regions of lability comprised between two
curves for which $u_0$ is the same odd multiple of $\tfrac12$ become very narrow
as $u_0$ increases. This is due to the fact that the two curves $u_0 = n + \tfrac12$
have contact of order $2n$ at the point where they cross the $p$-axis, that
is, at the point $p = n + \tfrac12$, $\gamma = 0$, and are all contained in the finite strip
$\gamma < \tfrac12$ (fig. 2).

* The computation is actually impracticable when $2\gamma$ is just less than 1; a method
for solving (11) and (12) with such values of $2\gamma$ would thus be of some interest. The
problem is, however, of no great practical importance, and from the theoretical point
of view, so long as it is desired only to trace the boundary curves, the knowledge of the
terminal point at $\gamma = \tfrac12$, $p = \tfrac14\sqrt2$ (see § 19) may be regarded as sufficient.





13. When $\gamma = \tfrac12$, the continued fractions $v(u)$ converge or not (Nörlund,
1924, p. 438) according as $1 - 8p^2 \gtrless 0$, as long as $u$ remains finite in
magnitude. To the right of the point $p = \tfrac14\sqrt2$ there are therefore no
points of boundary curves corresponding to finite values of $u_0$.



Fig. 2. The shaded area is a region of lability; the second of such regions runs
along $u_0 = 3/2$, and is very narrow.

The point $p = \tfrac14\sqrt2$, $\gamma = \tfrac12$ is an exceptional point in the representative
plane of the parameters. The behaviour of the equation at this point
(belonging to the case $2\gamma = 1$) is fully discussed later (§ 18).

VII. Characteristic Functions of the Equation 

14. When $p$ and $\gamma$ have such values that $u_0$ is either an integer or an
odd multiple of $\tfrac12$, the solutions of the equation are respectively periodic
with period $2\pi$, or "half-periodic". In the former case the factor
$\rho = \exp(\pm 2\pi iu_0)$, by which each integral is multiplied, has the value $+1$
for an increase $2\pi$ in $x$; in the latter case the value of the factor is $-1$,
so that the function has a period $4\pi$, its values in half-periods of length $2\pi$
being equal, and opposite in sign.

The integrals are evidently periodic for all rational real values of $u_0$,
the period being then given by $2\pi$ times the denominator of $u_0$. Since
real values of $u_0$ (making $|\rho| = 1$) are possible only in the regions of
stability, the curves $(p, \gamma)$ where periodic solutions exist are contained
in these regions and never cross or touch the boundary curves.

Consider now the ($A$) series, representing the derivative $d^2y/dx^2$
or $y/(1 + 2\gamma\cos x)$. From the series $\exp(iu_0x)\sum A_k\exp(ikx)$ and
$\exp(-iu_0x)\sum A_k\exp(-ikx)$ we deduce two independent solutions*
expressed in real form as

$$y' = \sum_{k=-\infty}^{\infty} A_k\cos(u_0+k)x, \qquad y'' = \sum_{k=-\infty}^{\infty} A_k\sin(u_0+k)x.$$

If $u_0 = n + \tfrac12$, each of the circular functions appearing in the above
series occurs in two terms, so that the half-periodic eigenfunctions become

$$C_{n+\frac12}(x) = \sum_{k=0}^{\infty}(A_{-n+k} + A_{-n-k-1})\cos(k+\tfrac12)x,$$

$$S_{n+\frac12}(x) = \sum_{k=0}^{\infty}(A_{-n+k} - A_{-n-k-1})\sin(k+\tfrac12)x.$$

These solutions of the equations can exist when $(p, \gamma)$ is a point on
either of the boundary curves given by $\gamma v(\tfrac12) = \pm 1$. But if we consider
the explicit expressions of the constants $A$ in terms of $v$-functions, we
easily see that one of

$$A_{-n+k} \pm A_{-n-k-1} \ \text{ is zero when } \ \gamma v(\tfrac12) = \pm 1,$$

so that only one periodic solution exists when $(p, \gamma)$ is a point on any one
of the curves $u_0 = n + \tfrac12$, in conformity with what happens in the case of
a Mathieu equation.

15. On the curves $u_0 = n$ ($1/v(1) = 0$), on the contrary, the equation
has two independent periodic solutions of period $2\pi$, namely

$$C_n(x) = \sum_{k=0}^{\infty} A_{-n+k}\cos kx, \qquad S_n(x) = \sum_{k=0}^{\infty} A_{-n+k}\sin kx.$$

The solutions are valid along the same curve $u_0 = n$, which is the
locus of double eigenvalues.

The possibility of the simultaneous existence of two solutions of
period $2\pi$ for the same values of the parameters $p$ and $\gamma$ (making $u_0$ an
integer) may be proved by the same method by which E. L. Ince proved
the impossibility of a pair of such solutions, in the case of Mathieu's
equation. Assuming that two independent solutions

$$X = \tfrac12 a_0 + a_1 \cos x + a_2 \cos 2x + a_3 \cos 3x + \ldots$$

$$Z = b_1 \sin x + b_2 \sin 2x + b_3 \sin 3x + \ldots$$

can satisfy the differential equation for the same values of $p$ and $\gamma$, we
easily find, by direct substitution in the equation, the necessary condition

* For the sake of brevity, we speak of the (A) series as solutions of the equation,
while the actual solutions, that is, the (B) series, are obtained from the former by division
by $(1 + 2\gamma\cos x)$. The reasons for the choice of the (A) series as eigenfunctions are
examined in § 16.




$$\begin{vmatrix} a_n & (n+1)^2 a_{n+1} + (n-1)^2 a_{n-1} \\ b_n & (n+1)^2 b_{n+1} + (n-1)^2 b_{n-1} \end{vmatrix} = 0,$$

which, when solved by recurrence, readily leads to

$$\begin{vmatrix} a_n & a_{n+1} \\ b_n & b_{n+1} \end{vmatrix} = 0, \qquad n = 1, 2, 3, \ldots \qquad (13)$$

a condition which is obviously satisfied if $b_n$ is proportional to $a_n$.


If the same procedure is applied to Mathieu's equation, we find the necessary
condition

$$\begin{vmatrix} a_n & a_{n+1} + a_{n-1} \\ b_n & b_{n+1} + b_{n-1} \end{vmatrix} = 0, \quad \text{which leads to} \quad \begin{vmatrix} a_n & a_{n+1} \\ b_n & b_{n+1} \end{vmatrix} = \text{const.} \neq 0,$$

a condition which is incompatible with the convergence of the Fourier series. The
material difference between the two cases is that, in the present one, the first of equations
(13) does not contain $a_0$, thus differing from the corresponding equation valid in the case
of Mathieu's equation.


Once the possibility of the coexistence of the two eigenfunctions has
been proved, it is easily verified that the Fourier series set out above
(which can be deduced as limiting cases from the general integrals as
$u_0 \to$ an integer) actually satisfy the equation, and constitute one pair of
eigenfunctions of integral functions of integral order, corresponding to
the even Mathieu functions.

16. In the preceding sections we have considered as characteristic
solutions the functions expressed in terms of (A) series, although these
actually give the second derivatives of the integrals $y$ of the equation.
By substituting the $B$-coefficients for the corresponding $A$'s, we might
easily write the explicit expressions for the characteristic integrals $y$.

However, the (A) series actually represent new eigenfunctions of
greater functional importance than the corresponding (B) series. It may
readily be shown, in fact, that two different eigenfunctions of the form
(A) with the same coefficient of $\cos x$ in the equation (that is, having
two different pairs of parameters, but such that $\gamma_1/p_1^2 = \gamma_2/p_2^2$) are
orthogonal to one another over a common period. If we eliminate $\cos x$
between the two identities

$$(1 + 2\gamma_1 \cos x)\,y_1'' + p_1^2\, y_1 = 0, \qquad (1 + 2\gamma_2 \cos x)\,y_2'' + p_2^2\, y_2 = 0$$

and integrate term by term, we readily obtain

$$(p_1^{-2} - p_2^{-2}) \int_{x_0}^{x_1} y_1''\, y_2''\, dx = \Bigl[\, y_2\, y_1' - y_1\, y_2' \,\Bigr]_{x_0}^{x_1},$$

and the right-hand side vanishes if the interval $(x_0, x_1)$ is a period of
both functions.



VIII. The Algebraic Form of the Equation,
and the Case $2\gamma = 1$

17. The above trigonometric expansions are applicable only in those
cases where $2\gamma < 1$ (cases which are often of more immediate physical
interest), since the inequality in question is the condition for convergence
of the continued fraction $v(u)$.

18. Transforming the differential equation into algebraic form by the simple
relation $t = -\exp(ix)$, we obtain

$$t(\gamma t^2 - t + \gamma)\frac{d^2y}{dt^2} + (\gamma t^2 - t + \gamma)\frac{dy}{dt} + p^2 y = 0. \qquad (14)$$

The equation has the singularities $t = 0$, $t = \infty$, arising from the above
transformation, and $t = P, Q = (1 \pm \Delta)/2\gamma$ [where $\Delta = \sqrt{1 - 4\gamma^2}$] corresponding
to the singularities of the original equation given by $\cos x = -1/2\gamma$.

If $2\gamma < 1$, the two last singularities lie on the real axis and their
abscissae are reciprocal to each other. If $2\gamma = 1$, the singularities coincide
at the point $t = 1$ ($x = \pi$); and finally, when $2\gamma > 1$, they fall on the unit
circle $|t| = 1$ and are complex conjugate. (The circle $|t| = 1$ obviously
corresponds to the real axis in the $x$-plane.)

When $2\gamma < 1$, Floquet's form of the integral is transformed, in terms
of $t$, to the form $t^{u_0}$ (or $t^{-u_0}$) multiplied by a Laurent series converging
in the annular region between the two circles $|t| = (1 \pm \Delta)/2\gamma$. Hence
the $t$-plane in which the function $y\,t^{-u_0}$ (or $y\,t^{u_0}$) is defined must not
be cut across this region; that is, the function $y\,t^{\pm u_0}$ (corresponding to a
periodic function of $x$) can be uniquely defined in a $t$-plane, cut only
between $O$ and $Q$ and between $P$ and $\infty$.

When $2\gamma > 1$, the width of the annular space becomes zero, and the
Laurent series in $t$, if any (that is, the Fourier series in $x$), converges at
most on the circle $|t| = 1$, that is, on the real $x$-axis.
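The three cases can be checked numerically. The following is a minimal sketch, not part of the original paper: it assumes the proper singularities are the roots of $\gamma t^2 - t + \gamma = 0$, that is $t = (1 \pm \Delta)/2\gamma$, and the function name `proper_singularities` is purely illustrative.

```python
import cmath

def proper_singularities(gamma):
    """Return the roots P, Q of gamma*t**2 - t + gamma = 0,
    i.e. t = (1 +/- Delta)/(2*gamma) with Delta = sqrt(1 - 4*gamma**2)."""
    delta = cmath.sqrt(1 - 4 * gamma ** 2)  # purely imaginary when 2*gamma > 1
    return (1 + delta) / (2 * gamma), (1 - delta) / (2 * gamma)

for gamma in (0.3, 0.5, 0.8):
    P, Q = proper_singularities(gamma)
    # For 2*gamma < 1 the roots are real and reciprocal; for 2*gamma = 1 they
    # coincide at t = 1; for 2*gamma > 1 they are conjugate points of |t| = 1.
    print(gamma, P, Q, abs(P), abs(Q))
```

In every case the product $PQ$ equals 1, which is the reciprocity of the abscissae noted in the text.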

19. The algebraic form of the equation is helpful in discussing its
properties in the case $2\gamma \geq 1$. When $\gamma = \tfrac12$, the continued fraction $v(u)$
converges only if $1 - 8p^2 > 0$, that is, if $p < \tfrac14\sqrt2$. The point $\gamma = \tfrac12$,
$p = \tfrac14\sqrt2$, actually appears to be an exceptional point of the equation,
which, in algebraic form, is in this case reduced to

$$t(t-1)^2\frac{d^2y}{dt^2} + (t-1)^2\frac{dy}{dt} + \tfrac14\, y = 0.$$

This well-known Legendre equation can be reduced to standard
hypergeometric form by the substitution * $t = z/(z-1)$, which gives

$$z = t/(t-1) = e^{ix}/(1 + e^{ix}),$$

$$z(1-z)\frac{d^2y}{dz^2} + (1 - 2z)\frac{dy}{dz} - \tfrac14\, y = 0.$$

* When $x$ varies on the real axis, $z$ varies on the straight line from
$\tfrac12 - i\infty$ to $\tfrac12 + i\infty$.

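For a complete discussion of the above equation, see Tannery and Molk, Fonctions
Elliptiques, Paris, 1898, III, 188. We write $K(z)$ for the elliptic integral having $z$ as
square of the Legendrian modulus ($z = k^2$):

$$K(z) = \tfrac12\pi\, F\{\tfrac12, \tfrac12; 1; z\} = \tfrac12\pi\Bigl\{1 + \bigl(\tfrac12\bigr)^2 z + \bigl(\tfrac{1\cdot3}{2\cdot4}\bigr)^2 z^2 + \ldots\Bigr\},$$

$$K'(z) = K(1-z) = -\tfrac12\Bigl\{F\{\tfrac12, \tfrac12; 1; z\}\,\log\frac{z}{16} + 4 f_1(z)\Bigr\},$$

where $f_1(z)$ is the power-series

$$f_1(z) = \bigl(\tfrac12\bigr)^2\frac{1}{1\cdot2}\, z + \bigl(\tfrac{1\cdot3}{2\cdot4}\bigr)^2\Bigl(\frac{1}{1\cdot2} + \frac{1}{3\cdot4}\Bigr) z^2 + \ldots$$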

The equation is satisfied by the complete elliptic integrals of the first
kind, $K(z)$ and $K'(z) = K(1-z)$. In terms of $t$ or $x$, two independent
integrals $y_1$ and $y_2$, being real on the circle $|t| = 1$, that is, on the real
$x$-axis, are

$$2y_1 = \sqrt{1-t}\,K(t) + \sqrt{1 - 1/t}\,K(1/t) = \sqrt{1+e^{ix}}\,K(-e^{ix}) + \sqrt{1+e^{-ix}}\,K(-e^{-ix}),$$

$$2iy_2 = \sqrt{1-t}\,K(t) - \sqrt{1 - 1/t}\,K(1/t) = \sqrt{1+e^{ix}}\,K(-e^{ix}) - \sqrt{1+e^{-ix}}\,K(-e^{-ix}).$$






The elliptic integrals are single-valued in a $t$-plane, cut along the
segments $(\infty, 1)$ or $(1, 0)$ respectively; similarly, the radicals $\sqrt{1-t}$
and $\sqrt{1 - 1/t}$ have branch lines between 1 and $\infty$ or 1 and 0 respectively.
The circle $|t| = 1$ can then be described from $t = 1$ to $t = 1$ again,
indefinitely; that is, $x$ can vary from $-\pi$ to $\pi$, $3\pi$, and so on, with the
corresponding point $t = -e^{ix}$ remaining always on the first sheet of the
Riemann surface. In this case the point corresponding to $\sqrt{1+e^{ix}}$
repeatedly describes the right half of an ordinary lemniscate, having the
points $\pm 1$ as foci, if we choose for the square root the value with positive
real part.

Correspondingly, the integrals $y_1$ and $iy_2$, which can be defined simply
as being the real and imaginary parts of $\sqrt{1+e^{ix}}\,K(-e^{ix})$ respectively,
are both periodic functions of $x$, with period $2\pi$. The functions, which
can easily be expressed in the form of Fourier series, are the two
eigenfunctions $K_1$ and $S_1$, in the case $\gamma = \tfrac12$. The double curve $u_0 = 1$ has then
its end-point at $\gamma = \tfrac12$, $p = \tfrac14\sqrt2$.



Alternatively, we may also regard the second turn of the circle $|t| = 1$
(corresponding to $\pi < x < 3\pi$) as described on the second sheet of the
Riemann surface, the passage from the first to the second sheet taking
place at the singular point $t = 1$ ($x = \pm\pi$), which is the end-point of the
cuts.

The point corresponding to VO +***) then describes the left half 
of the lemniscate, while assumes the conjugate value 

This corresponds to the assumption of a pair of integrals defined as 
follows:— 

for 

R{Vi+***!£(-***) 1 and - //{ Vi + - c**) —y ty 

for 

7t<jc<37t : R{-Vi +e ir K( - -y x and - i/[ - Vi + e u K( - — y n , 

the same branch of the square root being chosen in all cases. The integral
of the differential equation defined as $y_1$ for the first half of the period,
and as $-y_1$ for the second half, is actually a "half-periodic" function.
It is the second eigenfunction $K_{\frac12}$ of § 14, and can easily be expanded
as a Fourier series, which in point of fact proves to be an odd cosine
series. The integral $(y_2, y_2)$, on the contrary, is a mere duplication of
the integral $y_2$.

We thus conclude that the end-point of the curve-locus of the odd
eigenfunction $K_{\frac12}(x)$ is $\gamma = \tfrac12$, $p = \tfrac14\sqrt2$, which is also the terminal point
of the double curve of eigenvalue $u_0 = 1$.

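20. The general case $p \neq \tfrac14\sqrt2$ can be treated in a similar way. If
we write $\delta$ for $\sqrt{1 - 8p^2}$, the transformation $t = -e^{ix}$ changes the equation

$$(1 + \cos x)\frac{d^2y}{dx^2} + p^2 y = 0$$

into the equation in algebraic form

$$t(t-1)^2\frac{d^2y}{dt^2} + (t-1)^2\frac{dy}{dt} + 2p^2 y = 0,$$

and this is the Riemann $P$-equation

$$P\begin{Bmatrix} 0 & \infty & 1 & \\ 0 & 0 & \tfrac12(1+\delta) & t \\ 0 & 0 & \tfrac12(1-\delta) & \end{Bmatrix}.$$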

It is reduced to the hypergeometric form by the usual transformation

$$z = t/(t-1), \qquad t = z/(z-1),$$

giving

$$z(1-z)\frac{d^2y}{dz^2} + (1-2z)\frac{dy}{dz} - \tfrac14(1-\delta^2)\, y = 0.$$

The equation is symmetric in $z$ and $1 - z$, as in the particular case
$2p^2 = \tfrac14$. It is therefore convenient to assume, as independent integrals,
the following forms, which reduce to $K(z)$ and $K'(z) = K(1-z)$ when
$\delta = 0$, and still satisfy the fundamental equation $k'(z) = k(1-z)$:


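$$k(z) = \tfrac12\pi \sec \tfrac12\pi\delta \cdot F\{\tfrac12(1+\delta), \tfrac12(1-\delta); 1; z\},$$

where

$$F\{\tfrac12(1+\delta), \tfrac12(1-\delta); 1; z\} = f_1(z) = 1 + A_1 z + A_2 z^2 + \ldots$$

$$= 1 + \frac{1-\delta^2}{2^2}\, z + \frac{(1-\delta^2)(3^2-\delta^2)}{(2\cdot4)^2}\, z^2 + \ldots$$

and

$$k'(z) = -\tfrac12\{f_1(z)[\log z - 4C] + 4f_2(z)\},$$

$f_2(z)$ here denoting the series

where

$$C = -\tfrac12\gamma + \tfrac14\pi \tan \tfrac12\pi\delta - \tfrac12\Psi\{\tfrac12(1+\delta)\},$$

$\gamma$ here denoting Euler's constant, and $\Psi(x) = \Gamma'(x)/\Gamma(x)$. When $\delta = 0$, $C$
reduces to $\log 2$, so that $k'(z)$ reduces to $K'(z)$, as it should.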

Two real independent integrals, valid on the circle $|t| = 1$ (the
singular point $t = 1$ excepted), that is, on the real $x$-axis (the points
$x = (2n+1)\pi$ excepted), or on the line $R(z) = \tfrac12$ (the point at infinity
excepted), are, as above,

$$2y_1 = k(z) + k'(z), \qquad 2iy_2 = k(z) - k'(z),$$

since $k(z)$ and $k'(z) = k(1-z)$ are conjugate to each other when $z$ is on
the line $R(z) = \tfrac12$. The functions $k(z)$ and $k'(z)$ can be written as definite
integrals, using Euler's expression for the hypergeometric series, as
follows:




The integrals can be regarded as (transcendent) modular functions,
which remain unchanged for all those transformations of the ratio of the
periods that leave the Legendrian modulus unchanged. They are thus
apparently periodic functions of $x$, since the increase of $x$ by $2\pi$ simply
increases the ratio of the periods by 2.

Remark. Before coming to any conclusion concerning the existence of periodic
solutions for any value of $p$, when $2\gamma = 1$, one would have to investigate more carefully
the behaviour of the integrand, in particular its dependence on the index $\delta$, which may
assume any value, real or complex. The existence of periodic solutions for any value
of $p$ is put in better evidence by the discussion of the case $2\gamma > 1$ in § 21; the procedure
set out there can also be applied to the present case.

It is concluded that the first eigencurve $u_0 = \tfrac12$ ends at $\gamma = \tfrac12$, $p = 0$;
the second eigencurve $u_0 = \tfrac12$, and the double curve $u_0 = 1$, at $\gamma = \tfrac12$, $p = \tfrac14\sqrt2$.
It is regarded as very probable that all the eigencurves may terminate at
this particular point, since the resonance equation is divergent for greater
values of $p$. If this should be the case, the half-straight-line $\gamma = \tfrac12$, $p \geq \tfrac14\sqrt2$
might be regarded as the limiting form of the eigencurve $u_0 = n$ as $n$ tends
to infinity, thus permitting the existence of periodic solutions at all its
points.


IX. The Case $2\gamma > 1$

21. When $2\gamma > 1$ equation (2) is not a Hill's equation, so that the
determination of the characteristic exponent $u_0$ is possible neither by
Hill's method of the determinantal equation, nor by the method of
continued fractions as used in the earlier sections of this paper.

Since the singularities of the equation, expressed in its algebraic form,
all lie on the unit circle, it is known that the periodic solutions, if any,
or the expressions in Floquet's form $e^{iu_0x}F(x)$, where $F(x)$ is a Fourier
series, are valid only on the real $x$-axis.

The determination of $u_0$ is possible, however, according to Floquet's
theory, as soon as we are able to express the value of two integrals for
$x = x_0 + 2\pi$ in terms of their values at $x = x_0$.

The equation, reduced to algebraic form, is still (14). The two proper
singularities, $t = P = (1+\Delta)/2\gamma$ and $t = Q = (1-\Delta)/2\gamma$, are on the unit
circle $|t| = 1$, where they are represented by the conjugate points $P = e^{i\lambda}$
and $Q = e^{-i\lambda}$, where $2\gamma \cos\lambda = 1$.

Since the equation is reciprocal in the sense that the substitution $1/t$
for $t$ leaves it unchanged, the existence of an integral $f(t)$ defined around
$t = 0$ implies the existence of an integral $f(1/t)$ valid around $t = \infty$. The
characteristic exponents at the four singularities are all integers, namely
0, 0 at $t = 0$ and $t = \infty$, and 0, 1 at $t = P$ and $t = Q$, and the irregular
integrals there are all logarithmic (except at $t = P$ and $t = Q$ when $p = 0$).

In the neighbourhood of $t = 0$ we immediately construct the two
independent integrals:

$$\mathbf{g}_0(t) = 1 + g_1 t + g_2 t^2 + g_3 t^3 + \ldots$$

$$\mathbf{h}_0(t) = \mathbf{g}_0 \log t + h_1 t + h_2 t^2 + h_3 t^3 + \ldots$$

where the constants $g_n$ and $h_n$ are given by the recurrence relations

$$\gamma(n+1)^2 g_{n+1} - (n^2 - p^2) g_n + \gamma(n-1)^2 g_{n-1} = 0, \qquad g_0 = 1, \quad g_{-1} = 0,$$

$$\gamma(n+1)^2 h_{n+1} - (n^2 - p^2) h_n + \gamma(n-1)^2 h_{n-1} = -2\{\gamma(n+1) g_{n+1} - n g_n + \gamma(n-1) g_{n-1}\}.$$

The series converge inside the circle $|t| = 1$; the integrals $\mathbf{g}_\infty(t) = \mathbf{g}_0(1/t)$
and $\mathbf{h}_\infty(t) = \mathbf{h}_0(1/t)$ converge for $|t| > 1$ and represent two independent
solutions valid around $t = \infty$. The integrals $\mathbf{h}$ are single-valued in the
$t$-plane, cut, for example, from $-\infty$ to 0.
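The recurrence for the coefficients $g_n$ lends itself to direct computation. The sketch below is an illustration only, not the author's procedure: it assumes the recurrence $\gamma(n+1)^2 g_{n+1} - (n^2 - p^2) g_n + \gamma(n-1)^2 g_{n-1} = 0$ with $g_0 = 1$, $g_{-1} = 0$, and the helper names `g_coefficients` and `residual_14` are hypothetical. The second helper substitutes the truncated series into the left-hand side of (14) as a consistency check.

```python
def g_coefficients(gamma, p, n_max):
    """Coefficients g_0 .. g_{n_max} of the regular integral about t = 0, from
    gamma*(n+1)^2*g_{n+1} - (n^2 - p^2)*g_n + gamma*(n-1)^2*g_{n-1} = 0."""
    g = [1.0]      # g_0 = 1
    prev = 0.0     # g_{-1} = 0
    for n in range(n_max):
        nxt = ((n * n - p * p) * g[n] - gamma * (n - 1) ** 2 * prev) / (
            gamma * (n + 1) ** 2
        )
        prev = g[n]
        g.append(nxt)
    return g

def residual_14(gamma, p, g, t):
    """Left-hand side of equation (14) for the truncated series sum g_n t^n."""
    y = sum(c * t ** n for n, c in enumerate(g))
    dy = sum(n * c * t ** (n - 1) for n, c in enumerate(g) if n >= 1)
    d2y = sum(n * (n - 1) * c * t ** (n - 2) for n, c in enumerate(g) if n >= 2)
    q = gamma * t * t - t + gamma
    return t * q * d2y + q * dy + p * p * y

# Illustrative parameter values with 2*gamma > 1; any |t| < 1 would do.
coeffs = g_coefficients(1.0, 0.7, 80)
print(coeffs[:4], residual_14(1.0, 0.7, coeffs, 0.3))
```

The same loop, run to larger $n$, lets one watch the products $n^2 g_n$ stay bounded while oscillating.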

Around the singularity $t = P = (1+\Delta)/2\gamma$, we similarly define a regular
integral

$$\mathbf{g}_P = p_1(t-P) + p_2(t-P)^2 + \ldots$$

where the constants $p_n$ are defined by

$$(n+1)n P\Delta\, p_{n+1} + \{n(n-1)\gamma P + n^2\Delta + p^2\}\, p_n + (n-1)^2\gamma\, p_{n-1} = 0$$

and it is convenient to assume $p_1 = Q = e^{-i\lambda}$. Similarly around $t = Q$ we have
the regular integral *

$$\mathbf{g}_Q = q_1(t-Q) + q_2(t-Q)^2 + \ldots$$

with constants $q_n$ defined by

$$-(n+1)n Q\Delta\, q_{n+1} + \{n(n-1)\gamma Q - n^2\Delta + p^2\}\, q_n + (n-1)^2\gamma\, q_{n-1} = 0,$$

and we agree to assume $q_1 = P = e^{i\lambda}$. Since $P$ and $Q$, as well as the
coefficients $p_n$ and $q_n$, are conjugate, we easily see that

$$\mathbf{g}_P(t) = \overline{\mathbf{g}_Q(\bar t)},$$

where $\bar x$ denotes the complex conjugate of $x$.

Around $t = P$ and similarly around $t = Q$ we can also define the
irregular integrals

$$\mathbf{h}_P = \mathbf{g}_P \log (t-P) + r_0 + r_1(t-P) + r_2(t-P)^2 + \ldots$$

* The integral $\mathbf{g}_Q$ is obtained from $\mathbf{g}_P$ by substituting $-\Delta$ for $\Delta$. Since $\Delta$ is purely
imaginary, the coefficients of $\mathbf{g}_Q$ are conjugate to those of $\mathbf{g}_P$.



with a similar expression for $\mathbf{h}_Q$, but these are of much more restricted
interest in the present context. We may merely remark that, although
these integrals are singular at $t = P$ and $t = Q$ respectively, they are finite
in magnitude at these points.*

The expressions $\mathbf{g}_P(1/t)$ and $\mathbf{g}_Q(1/t)$, convergent about $t = Q$ and
$t = P$ respectively, are also integrals of the equation; and it is easily
seen that, with the values assumed for the arbitrary coefficients $p_1$ and
$q_1$, we have the simple relation

$$\mathbf{g}_P(1/t) = -\mathbf{g}_Q(t).$$

Hence, when $t$ is on the unit circle, so that $\bar t = 1/t$, $\mathbf{g}_P$ and $\mathbf{g}_Q$ are purely
imaginary.

The integrals $\mathbf{g}_P$ and $\mathbf{g}_Q$, which have regions of convergence in common
with the integrals $\mathbf{g}_0$ and $\mathbf{h}_0$, can be expressed as linear combinations of
the latter with constant (complex) coefficients. It is seen at once that if
$\mathbf{g}_P = a\mathbf{g}_0 + b\mathbf{h}_0$ we shall have $\mathbf{g}_Q = \bar a\mathbf{g}_0 + \bar b\mathbf{h}_0$.

The complex constants $a$ and $b$ can be determined, when required,
by computing the (complex) series $\mathbf{g}_P$ and the integrals $\mathbf{g}_0$ and $\mathbf{h}_0$, as
well as their derivatives, at a point where all the expressions converge;
for instance, at the point $1/2\gamma$, which certainly belongs to the regions
of convergence of all the series.

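22. If we suppose that the above linear relations are known, we can
then determine the variation of the integrals $\mathbf{g}_P$ and $\mathbf{g}_Q$ for a $2\pi$-increase
in the original variable $x$; that is, for a positive turn round the origin of
the actual variable $t$. When $t$ describes a closed circuit round the origin
$\mathbf{g}_0(t)$ remains unchanged, and $\mathbf{h}_0(t)$ increases by $2\pi i\,\mathbf{g}_0$. By eliminating
$\mathbf{g}_0$ and $\mathbf{h}_0$ between the expressions for $\mathbf{g}_P$ and $\mathbf{g}_Q$ and those of the new
values $\mathbf{g}^*_P$ and $\mathbf{g}^*_Q$, we easily obtain †

$$\mathbf{g}^*_P = a_{11}\mathbf{g}_P + a_{12}\mathbf{g}_Q = \Bigl\{1 + \frac{2\pi i\, b\bar b}{a\bar b - \bar a b}\Bigr\}\mathbf{g}_P - \frac{2\pi i\, b^2}{a\bar b - \bar a b}\,\mathbf{g}_Q,$$

$$\mathbf{g}^*_Q = a_{21}\mathbf{g}_P + a_{22}\mathbf{g}_Q = \frac{2\pi i\, \bar b^2}{a\bar b - \bar a b}\,\mathbf{g}_P + \Bigl\{1 - \frac{2\pi i\, b\bar b}{a\bar b - \bar a b}\Bigr\}\mathbf{g}_Q.$$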

A linear combination $F(x) = A\mathbf{g}_P(x) + B\mathbf{g}_Q(x)$ (where $\mathbf{g}_P$ and $\mathbf{g}_Q$ are
regarded as functions of $x$ by means of the relation $t = -e^{ix}$) will then
satisfy Floquet's condition

$$F(x + 2\pi) = \rho F(x),$$

provided that $A\mathbf{g}^*_P + B\mathbf{g}^*_Q = \rho(A\mathbf{g}_P + B\mathbf{g}_Q)$. From the expressions just given
for $\mathbf{g}^*_P$ and $\mathbf{g}^*_Q$, we find, using $a_{11}a_{22} - a_{12}a_{21} = 1$, that the only possible
values of $\rho$ are those given by the quadratic equation

$$\rho^2 - (a_{11} + a_{22})\rho + 1 = 0.$$

Since in the present case we always have $a_{11} + a_{22} = 2$, so that the
equation has the double root $\rho = 1$, we conclude that for any value of
the constants $p$ and $\gamma$ of the original equation (provided that $2\gamma > 1$) it is
always possible to obtain two independent integrals, both periodic with
period $2\pi$.

* We have $\mathbf{h}_P(P) = -\mathbf{h}_Q(Q)$.

† The determinant of the transformation (which is the ratio of the two Wronskian
determinants of the functions $\mathbf{g}^*_P$, $\mathbf{g}^*_Q$ and $\mathbf{g}_P$, $\mathbf{g}_Q$ respectively) is equal to 1, by a theorem
due to Poincaré.

This conclusion is generally valid whenever (i) the singularities, other than 0 and $\infty$,
all lie on the circle $|t| = 1$; (ii) the singularity is logarithmic. The conclusion is also
true for other values of the characteristic exponents at $t = 0$, subject to conditions which
are not difficult to formulate.

23. Two values of the constants $A$ and $B$ making $\rho = 1$ (that is, such
that $u_0$ is an integer) are $A = \bar b$, $B = -b$. For the corresponding Floquet
solution we obtain

$$F(x) = \bar b\,\mathbf{g}_P - b\,\mathbf{g}_Q = C\mathbf{g}_0.$$

This means that $\mathbf{g}_0$, when regarded as a function of $x$, can be expressed
as a Fourier series valid on the real $x$-axis, corresponding to the circle
$|t| = 1$. Since $\mathbf{g}_0$ is also defined, and single-valued, inside the circle
$|t| = 1$, the corresponding Fourier series will be valid in the upper
half-plane $I(x) > 0$.

The second periodic integral will obviously be $\mathbf{g}_0(1/t) = \mathbf{g}_\infty(t)$, defined
for $|t| > 1$, and representing a Fourier series in $x$, valid in the lower
half of the $x$-plane. Since $\mathbf{g}_0$ and $\mathbf{g}_\infty$ are conjugate to each other when
$|t| = 1$, two real integrals valid for $|t| = 1$ are obtained by considering
the real and the imaginary parts of $\mathbf{g}_0(t)$ respectively. They give, in
terms of $x$, two real Fourier series, which are valid only on the real
$x$-axis.

The validity of the series $\mathbf{g}_0(t)$ for $|t| = 1$ can also be proved directly. Since its
coefficients satisfy the relation

$$\gamma(n+1)^2 g_{n+1} - (n^2 - p^2) g_n + \gamma(n-1)^2 g_{n-1} = 0,$$

the expression $V(n) = n^2 g_n$, considered as a function of $n$, satisfies the usual difference
equation

$$\gamma V(n+1) - (1 - p^2/n^2)\, V(n) + \gamma V(n-1) = 0,$$

which tends asymptotically to

$$\gamma V(n+1) - V(n) + \gamma V(n-1) = 0.$$

Two independent (complex) integrals of the above equation are $P^n = e^{in\lambda}$ and
$Q^n = e^{-in\lambda}$; two independent integrals in real form are consequently $\cos n\lambda$ and
$\sin n\lambda$. Since $g_n$ is always real, $n^2 g_n$ will tend to a linear combination of $\cos n\lambda$ and
$\sin n\lambda$. Such expressions have no definite limit (that is, the expression does not
converge) but are limited in magnitude. This means that $|g_n| < K/n^2$, $K$ being a
suitable constant.

It follows that the series $\mathbf{g}_0$ tends asymptotically to a series the terms of which are
respectively smaller in modulus than those of the series $K\sum t^n/n^2$, which converges
absolutely for $|t| = 1$. The convergence of the Fourier series is actually slow, so that
a method giving a closed expression for its sum is much to be desired.


X. General Conclusions and Summary 

24. When $2\gamma < 1$, the equation considered, that is, the equation

$$\frac{d^2y}{dx^2} + \frac{p^2}{1 + 2\gamma\cos x}\, y = 0,$$

has periodic solutions, with period $2\pi$ or $4\pi$, only in those cases where
the parameters $p$ and $\gamma$ are such that the representative point belongs to
a curve where $2u_0$ is an integer. In the regions of stability bounded by
such curves, with $2u_0$ an odd integer, an infinity of curves exist giving
periodic solutions with periods that are multiples of $2\pi$.
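Where no closed form is at hand, the characteristic exponent for $2\gamma < 1$ can be estimated numerically from Floquet's theory. The sketch below is an illustration under stated assumptions, not the continued-fraction method of the paper: it integrates the equation over one period with the classical fourth-order Runge-Kutta scheme, starting from two unit initial conditions, and uses the relation $2\cos 2\pi u_0 = $ trace of the resulting matrix. The names `monodromy` and `characteristic_exponent` are hypothetical, and the exponent is returned reduced to the range $0 \le u_0 \le \tfrac12$.

```python
import math

def monodromy(p, gamma, steps=4000):
    """Integrate y'' = -p^2 y / (1 + 2 gamma cos x) over one period 2*pi with
    classical RK4, from (y, y') = (1, 0) and (0, 1).
    Returns the 2x2 matrix ((y1, y2), (v1, v2)) of end values."""
    if not 2.0 * abs(gamma) < 1.0:
        raise ValueError("this sketch needs 2*gamma < 1 (no real singularities)")

    def accel(x, y):
        return -p * p * y / (1.0 + 2.0 * gamma * math.cos(x))

    h = 2.0 * math.pi / steps
    columns = []
    for y, v in ((1.0, 0.0), (0.0, 1.0)):
        x = 0.0
        for _ in range(steps):
            k1y, k1v = v, accel(x, y)
            k2y, k2v = v + 0.5 * h * k1v, accel(x + 0.5 * h, y + 0.5 * h * k1y)
            k3y, k3v = v + 0.5 * h * k2v, accel(x + 0.5 * h, y + 0.5 * h * k2y)
            k4y, k4v = v + h * k3v, accel(x + h, y + h * k3y)
            y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
            v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
            x += h
        columns.append((y, v))
    (y1, v1), (y2, v2) = columns
    return ((y1, y2), (v1, v2))

def characteristic_exponent(p, gamma):
    """u_0 in [0, 1/2] from 2*cos(2*pi*u_0) = trace; None in a region of lability."""
    (y1, _), (_, v2) = monodromy(p, gamma)
    trace = y1 + v2
    if abs(trace) > 2.0:
        return None  # real multipliers rho and 1/rho: no stable solution
    return math.acos(trace / 2.0) / (2.0 * math.pi)
```

For $\gamma = 0$ the equation reduces to $y'' + p^2 y = 0$, so the routine should return $u_0 = p$ for $0 \le p \le \tfrac12$, which gives a simple check of the sketch.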

Finally, regions of lability also exist, where the equation has no stable
solutions. With the exception of the first one, bounded by the two
curves $u_0 = \tfrac12$, starting at $p = \tfrac12$, $\gamma = 0$, the regions of lability are extremely
narrow; the region comprised between the curves $u_0 = 3/2$ is already of
negligible width.

When periodic solutions exist, they are given by Fourier series valid
in a strip of the $x$-plane, the width of which increases with decreasing $\gamma$.

In the cases $2\gamma > 1$, on the contrary, stable solutions exist everywhere;
they are periodic with period $2\pi$, and are represented by Fourier series
converging in a strip of zero width, that is, on the real axis only. All
the solutions are finite in magnitude at the singular, real points given by
$2\gamma\cos x + 1 = 0$, but in general their first derivative has a logarithmic
infinity there, so that the Fourier series cannot be differentiated term
by term.

If the equation is written in the general form

$$(a + b\cos x)\frac{d^2y}{dx^2} + y = 0, \qquad a = 1/p^2, \quad b = 2\gamma/p^2,$$

the positive half-quadrant $b > a$ is a region of stability; in the lower
half of the quadrant the regions of stability are bounded by curves which
are tangent to the straight line $a = b$ at the point $a = b = 8$,* with the
exception of one of the curves $u_0 = \tfrac12$, which has the line $a = b$ as asymptote.

* The argument of § 20 does not establish this statement finally, though it may be
regarded as very probable. The double curve $u_0 = 1$, and one of the curves $u_0 = \tfrac12$,
actually converge to this point.

For the general treatment of linear equations with periodic coefficients, 
and the questions of stability and lability, we may cite the works given 
in the list below. 


REFERENCES TO LITERATURE 

Nörlund, N. E., 1924. Vorlesungen über Differenzenrechnung, Berlin.

Sansone, G., 1941. Le equazioni differenziali nel campo reale, Bologna. Esp.
Chapter VI.

Strutt, M. J. O., 1932. Lamé-sche, Mathieu-sche, und verwandte Funktionen,
Berlin. This contains an extensive bibliography on Hill's problem.

Strutt, M. J. O., 1943. Eigenwaarde-Krommen bij Problemen van Hill, I and II,
Amsterdam.

Strutt, M. J. O., 1943. Eigenfuncties bij Problemen van Hill, I and II, Amsterdam.

Tannery, J., and Molk, J., 1898. Fonctions Elliptiques, Paris.

Whittaker, E. T., and Watson, G. N., 1927. Modern Analysis, Cambridge.
Esp. Chapter XIX.


(Issued separately June 16, 1950) 



52 


A. C. Aitken, Studies in Practical Mathematics 


IV.—Studies in Practical Mathematics. V. On the Iterative 
Solution of a System of Linear Equations.* By 
A. C. Aitken, D.Sc., F.R.S., Mathematical Institute, 
16 Chambers Street, Edinburgh, 1.

(MS. received April 28, 1949. Read July 4, 1949) 

Synopsis 

The convergence of customary processes of iteration for solving linear equations, in
particular simple and Seidelian iteration, is studied from the standpoint of matrices. A
new variant of Seidelian iteration is introduced. In the positive definite case it always
converges, the characteristic roots of its operator being real and positive and less than
unity.

1. Preliminary Considerations 

There has appeared recently a series of papers (Bodewig, 1947) reviewing
various extant methods of solving simultaneous linear equations and
assessing their relative efficiency. It is claimed, on the basis of this
assessment, that the oldest and most elementary of these methods, that
of successive elimination of the unknowns followed by resubstitution,
involves fewer and simpler operations than any of its more recent
competitors. The assessment of efficiency is based on the number of operations
required before the solution is complete; for example, it is stated that
in the case of a system of $n$ equations, the whole process of solution by
the method of elimination requires $n(n^2 + 3n - 1)/3$ multiplications and
$n(n-1)(2n+5)/6$ additions.

These assessments must be viewed with respect; but are equally 
subject to qualification and revision. For example, many operations of 
addition are not separate from those of multiplication, but on the machine 
are cumulated along with them. Again, copying down is itself an 
operation that takes a relatively large proportion of time and involves 
a risk of error; any operation that minimizes the necessity of such 
copying is advantageous. There are certainly cases, especially when the 
matrix A of the system is dominated by its diagonal elements, in which 
Seidelian iteration converges well. Further, if in such a case the 
greatest characteristic root of the iterative operation is real and rather 
small compared with 1, powerful methods are available for gaining at 
one step a much enhanced approximation. Finally, the Southwellian
technique, which is that of Seidelian iteration with an admixture of
opportunism, has stood the test of trial in the engineering applications
with which it deals.

* This paper was assisted by a grant from the Carnegie Trust for the Universities
of Scotland.

In the present paper, we first make some general observations on the
rapidity of convergence of iterative processes (some of this crosses ground
already traversed by Dr Bodewig), and then go on to introduce a new
variant of the Seidelian process, based on an operator which, when the
matrix $A$ of the system to be solved is positive definite, has characteristic
roots real and confined to the range $0 < \lambda < 1$. The advantages of this
will be seen in due course.

For the sake of economy of space, the numerical examples refer to 
systems of low order, and are indeed capable of being solved just as 
rapidly in other ways. They are intended to serve as illustrations of 
principle, and are designed to exhibit to the eye, in a not specially 
favourable case, those properties of convergence that are derived in the 
text. 


2. The General Process of Iteration 

The usual processes of iteration have often been described (e.g. in
Frazer, Duncan and Collar, 1938; Bodewig, 1947) and are easily
classified. Let the system of equations be denoted in matrix notation
by $Ax = h$. The matrix $A$ can be expressed in infinitely many ways as
$B - C$. We choose $B$ as a non-singular matrix such that either $B^{-1}$ is
easily evaluated as a first step, or else the effect of $B^{-1}$ on any vector can
be obtained by simple arithmetical routine. The choice of $B$ is wide;
in the simplest case it could be scalar; in the next simplest, diagonal;
in the next simplest, upper or lower triangular. Iteration consists in the
use of the recurrence relation

$$Bx^{(t+1)} = Cx^{(t)} + h, \quad \text{that is,} \quad x^{(t+1)} = B^{-1}Cx^{(t)} + B^{-1}h, \qquad (1)$$

the best working rule consisting in the use of vector-differences, thus,

$$B\{x^{(t+1)} - x^{(t)}\} = C\{x^{(t)} - x^{(t-1)}\}. \qquad (2)$$

The iterated vectors $x^{(t)}$ are derived in this way from an initial vector $x^{(0)}$,
adopted as a first approximation to the vector $x$ of solutions.
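The recurrence $Bx^{(t+1)} = Cx^{(t)} + h$ with $C = B - A$ can be sketched in a few lines. The code below is an illustration only, restricted by assumption to a lower triangular $B$ (which includes the diagonal case), so that the effect of $B^{-1}$ is obtained by forward substitution; the function names and the small test system are hypothetical and not taken from the paper.

```python
def forward_substitute(B, r):
    """Solve B x = r for lower triangular B with non-zero diagonal."""
    n = len(r)
    x = [0.0] * n
    for i in range(n):
        s = r[i] - sum(B[i][j] * x[j] for j in range(i))
        x[i] = s / B[i][i]
    return x

def iterate(A, h, B, x0, steps):
    """Apply B x_new = C x + h repeatedly, where C = B - A."""
    n = len(h)
    C = [[B[i][j] - A[i][j] for j in range(n)] for i in range(n)]
    x = list(x0)
    for _ in range(steps):
        rhs = [sum(C[i][j] * x[j] for j in range(n)) + h[i] for i in range(n)]
        x = forward_substitute(B, rhs)
    return x

def diagonal_part(A):
    """B for simple (diagonal) iteration: b_ii = a_ii, zero elsewhere."""
    n = len(A)
    return [[A[i][j] if i == j else 0.0 for j in range(n)] for i in range(n)]

def lower_part(A):
    """B for Seidelian iteration: b_ij = a_ij for i >= j, zero above the diagonal."""
    n = len(A)
    return [[A[i][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]

# Hypothetical diagonally dominant system whose solution is (1, 2, 3).
A = [[4.0, 1.0, 1.0], [1.0, 5.0, 2.0], [1.0, 2.0, 6.0]]
h = [9.0, 17.0, 23.0]
print(iterate(A, h, diagonal_part(A), [0.0, 0.0, 0.0], 100))
print(iterate(A, h, lower_part(A), [0.0, 0.0, 0.0], 100))
```

Only the choice of $B$ differs between the two calls; the loop itself is the same recurrence.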

The rapidity of convergence of the sequence $x^{(t)}$ towards $x$ thus depends
on the latent roots of the matrix $B^{-1}C$. The characteristic equation of
the iteration is therefore $|\lambda B - C| = 0$, and if its roots, in descending
order of moduli, are $\lambda_1, \lambda_2, \ldots, \lambda_n$, convergence will be assured provided
that $|\lambda_1| < 1$, and will be the more rapid the smaller $|\lambda_1|$ is. In certain
cases $\lambda_1$ will be real; if this can be ensured, methods for accelerating the
convergence can be applied. These methods can, in fact, be applied
even when $|\lambda_1| > 1$, provided that $|\lambda_2| < 1$, in much the same way
that $1 + \lambda + \lambda^2 + \ldots + \lambda^{t-1}$, when provided with the remainder term
$\lambda^t/(1-\lambda)$, will yield $1/(1-\lambda)$ for all values of $\lambda$; but this case has
little practical value. If $\lambda_1$ is one of a pair of conjugate complex roots,
accelerative methods exist (Aitken, 1925, p. 302), but are less convenient
to apply.

A first classification of iterative methods may be based on the nature
of $B$. If $B$ is purely diagonal, for example if it is the "vertebra" of $A$,
so that $b_{ii} = a_{ii}$, we have the type of iteration that is often called simple
or ordinary, but could equally well be called diagonal. If $B$ is "lower
triangular", that is, if $b_{ij} = 0$, $i < j$, $b_{ii} \neq 0$, we have what may be called
lower triangular iteration. Not essentially different is upper triangular
iteration. Seidelian iteration (Seidel, 1874), as usually understood, is the
special lower triangular iteration where $b_{ii} = a_{ii}$, $b_{ij} = a_{ij}$, $i > j$, $b_{ij} = 0$,
$i < j$, or any similar iteration with the equations in some permuted
order. A variant due to Morris (Frazer, Duncan and Collar, 1938,
p. 132) uses a preliminary evaluation of $B^{-1}$ and $B^{-1}h$, but the results
at each step are the Seidelian ones.

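At this stage we may illustrate, by simple examples of the 3rd order,
the characteristic equations of the simplest diagonal iteration ($b_{ii} = a_{ii}$)
and of lower triangular Seidelian iteration respectively:

$$\begin{vmatrix} \lambda a_{11} & a_{12} & a_{13} \\ a_{21} & \lambda a_{22} & a_{23} \\ a_{31} & a_{32} & \lambda a_{33} \end{vmatrix} = 0, \qquad (3)$$

$$\begin{vmatrix} \lambda a_{11} & a_{12} & a_{13} \\ \lambda a_{21} & \lambda a_{22} & a_{23} \\ \lambda a_{31} & \lambda a_{32} & \lambda a_{33} \end{vmatrix} = 0. \qquad (4)$$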

If $A$ is symmetric, the roots of (3) are real, though not necessarily all
less than 1; in (4), on the other hand, where we note the unsymmetrical
disposition of the $\lambda$'s, the roots may be real or complex according to
circumstance, even when $A$ is symmetric. In justification of this last
remark we may refer to the simple positive definite example of the 3rd
order, $a_{ii} = 3$, $a_{ij} = a_{ji} = 2$, $i \neq j$.
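The example can be worked through explicitly. Expanding the Seidelian determinant with $\lambda$ multiplying the entries on and below the diagonal, for $a_{ii} = 3$, $a_{ij} = 2$, gives $\lambda(27\lambda^2 - 28\lambda + 8) = 0$; this expansion is a reader's calculation, not quoted from the paper, and the sketch below checks it numerically. The helper names are hypothetical.

```python
import cmath

A = [[3, 2, 2], [2, 3, 2], [2, 2, 3]]

def seidel_char_matrix(A, lam):
    """Matrix whose determinant is the Seidelian characteristic function:
    entries on and below the diagonal are multiplied by lam."""
    n = len(A)
    return [[lam * A[i][j] if j <= i else A[i][j] for j in range(n)]
            for i in range(n)]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Non-zero roots of 27*lam^2 - 28*lam + 8 = 0; the discriminant is negative.
disc = cmath.sqrt(28 ** 2 - 4 * 27 * 8)
roots = [(28 + disc) / 54, (28 - disc) / 54]
for lam in roots:
    print(lam, abs(lam), abs(det3(seidel_char_matrix(A, lam))))
```

The two non-zero roots form a complex conjugate pair with squared modulus $8/27$, so the iteration converges although its dominant roots are not real.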

In simple iteration it is known (Bodewig, 1947), and will be proved
here, that if $A$ is symmetric and positive definite, it is always possible
to find a diagonal matrix $B$ such that the numerically greatest latent
root $\lambda_1$ of the iterating operator is not only real but such that $|\lambda_1| < 1$.
It is also known (Whittaker and Robinson, 1929, p. 255; Bodewig, 1947)
that in this positive definite case Seidelian iteration always converges.
We proceed to prove these facts from the standpoint of matrix theory.



3. The Positive Definite Case

Let $A$ be positive definite. Then its diagonal elements $a_{ii}$ are all
positive and constitute a diagonal matrix $D$. We may normalize the
given equations $Ax = h$ to

$$D^{-1/2}AD^{-1/2}y = D^{-1/2}h, \quad \text{where } y = D^{1/2}x. \qquad (1)$$

This normalization is for the purpose of theory only, and need not be
resorted to in practice. There, it is enough to semi-normalize, obtaining
the non-symmetric system $D^{-1}Ax = D^{-1}h$. However, since

$$D^{-1}A = D^{-1/2}(D^{-1/2}AD^{-1/2})D^{1/2}, \qquad (2)$$

the matrices $D^{-1}A$ and $D^{-1/2}AD^{-1/2}$ are similar and so have the same
latent roots. Each has unit diagonal elements, and so the trace is equal
to $n$; hence, the latent roots being all positive, we must have $\lambda_1 < n$,
$\lambda_n > 0$. By writing $D^{-1}A = \tfrac12 n(I - C)$, $D^{-1}h = \tfrac12 nk$, we reduce the equations
to the form $(I - C)x = k$, where the latent roots of $C$ evidently lie in the
range $-1 < \lambda < 1$. A simple iteration based upon

$$x_{t+1} = Cx_t + k \qquad (3)$$

can now be applied. The process will always converge; though often,
as we shall see (§ 6) by an example, with disappointing slowness.
Naturally, if we knew beforehand the approximate location of the latent
roots of $D^{-1}A$, and especially of the smallest root, we could in most
cases make a better change of matrix origin than $\tfrac12 nI$; but such knowledge
is not usually precise enough.
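The change of origin can be tried on the third-order example of § 2. The sketch below is an illustration under the assumption that the semi-normalized system is written as $(I - C)x = k$ with $C = I - (2/n)D^{-1}A$ and $k = (2/n)D^{-1}h$; the function name and the right-hand side are hypothetical.

```python
def simple_iteration(A, h, steps):
    """Iterate x_{t+1} = C x_t + k with C = I - (2/n) D^{-1} A and
    k = (2/n) D^{-1} h, where D is the diagonal of A."""
    n = len(h)
    C = [[(1.0 if i == j else 0.0) - 2.0 * A[i][j] / (n * A[i][i])
          for j in range(n)] for i in range(n)]
    k = [2.0 * h[i] / (n * A[i][i]) for i in range(n)]
    x = [0.0] * n
    for _ in range(steps):
        x = [sum(C[i][j] * x[j] for j in range(n)) + k[i] for i in range(n)]
    return x

# a_ii = 3, a_ij = 2: the latent roots of D^{-1}A are 7/3, 1/3, 1/3, so those
# of C are -5/9, 7/9, 7/9 and the iteration converges.
A = [[3.0, 2.0, 2.0], [2.0, 3.0, 2.0], [2.0, 2.0, 3.0]]
h = [5.0, 3.0, 6.0]  # chosen so that the solution is (1, -1, 2)
print(simple_iteration(A, h, 300))
```

For this matrix the unshifted diagonal iteration, with operator $I - D^{-1}A$, would have a latent root $-4/3$ and so would diverge; the factor $\tfrac12 n$ is what brings every root of $C$ inside the range $-1 < \lambda < 1$.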

Let us next consider the Seidelian case. The property of convergence,
in the case when $A$ is positive definite, is perhaps most easily deduced
indirectly, from the fact that $Ax = h$ can be regarded, and in infinitely
many ways, as the normal equations of some linear least-square problem.
For we have $A = M'M$, $|M| \neq 0$, a resolution which is possible in infinitely
many ways, since $HM$, where $H$ is an arbitrary orthogonal matrix, can
replace $M$ here; and then the equations $M'Mx = h$ are the normal
equations corresponding to the "observational" equations $Mx = (M')^{-1}h$.

These normal equations arise from the minimizing of the definite 
quadratic form s* = (Mx-k)'(Mx-A), where h=(M')~ 1 h, and this is a 
sum of squares which can be transformed to a different sum of squares 
(Whittaker and Robinson, 1929, p. 255) by the classical reduction of 
Lagrange (Turnbull and Aitken, 1932, p. 83), in which, if we gather all 
terms in x x into a squared term, we obtain 

f*-fl < - 1 (flll* 1 +«7l t * l + . . . .... 


( 4 ) 




every term later than the first being free of $x_1$. To annul the squared
residual bracketed in the first term is to reduce $s^2$; and such annulling
is the typical single operation in any phase of Seidelian iteration. The
point is, that the annulment is effected by modifying $x_1$ only, and so the
later terms in $s^2$ are unaffected. Each operation of Seidelian iteration
thus reduces $s^2$; and so the vector of values $\{x_1\ x_2\ \dots\ x_n\}$ converges,
as desired, to those values that minimize $s^2$. It follows, a posteriori,
that the latent roots $\lambda_i$ of the Seidelian operation must be such that
$|\lambda_i| < 1$; but, as has been seen from the simple example of § 2, they
may be complex.
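
That the Seidelian roots may be complex can be checked numerically. The sketch below is an illustration under stated assumptions: the example of § 2 is taken to be the third-order matrix with $a_{ii} = 3$, $a_{ij} = 2$ ($i \neq j$), and the helper name `seidel_matrix` is hypothetical. The iterating matrix $(D - C)^{-1}C'$ is built by forward substitution; since its first column vanishes, one root is zero and the other two are the roots of a quadratic.

```python
import cmath

def seidel_matrix(A):
    """Return G = (D - C)^{-1} C' for A = D - C - C', by forward substitution."""
    n = len(A)
    G = [[0.0] * n for _ in range(n)]
    for col in range(n):
        for i in range(n):
            # Entry (i, col) of C' is -a_{i,col} above the diagonal, zero elsewhere.
            rhs = -A[i][col] if col > i else 0.0
            # Row i of D - C holds a_{ij} for j <= i.
            rhs -= sum(A[i][j] * G[j][col] for j in range(i))
            G[i][col] = rhs / A[i][i]
    return G

A3 = [[3.0, 2.0, 2.0], [2.0, 3.0, 2.0], [2.0, 2.0, 3.0]]
G = seidel_matrix(A3)

# det G = 0, so the non-zero roots satisfy
# lambda^2 - (trace) lambda + (sum of principal 2x2 minors) = 0.
trace = sum(G[i][i] for i in range(3))
minors = sum(G[i][i] * G[j][j] - G[i][j] * G[j][i]
             for i in range(3) for j in range(i + 1, 3))
disc = cmath.sqrt(trace * trace - 4.0 * minors)
roots = [(trace + disc) / 2.0, (trace - disc) / 2.0]
print(roots, [abs(r) for r in roots])
```

For this matrix the quadratic is $27\lambda^2 - 28\lambda + 8 = 0$, whose discriminant is negative; the two roots are conjugate complex with $|\lambda|^2 = 8/27$.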

The Seidelian operation can be described thus. Take $d_{ii} = a_{ii}$ as
before. Let C be the lower triangular matrix $c_{ii} = 0$, $c_{ij} = -a_{ij}$, $i > j$.
Then $A = D - C - C'$, the iterating matrix is $(D - C)^{-1}C'$, and the
characteristic equation of the iteration is $|\lambda(D - C) - C'| = 0$. We learn
therefore that if $D - C - C'$ is positive definite, then the roots of the above
equation are such that $|\lambda| < 1$.
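
A minimal Python sketch of the Seidelian operation follows, together with a check that each cycle reduces the quadratic form $x'Ax - 2h'x$, which differs from $s^2$ only by a constant. The function names are hypothetical, and the numbers are those of the system of § 6 under the assumed readings $a_{12} = 0.92$, $a_{44} = 6.23$, $h_2 = 6.32$.

```python
def seidel_sweep(A, h, x):
    """One Seidelian cycle: annul each residual in turn, using the newest values."""
    n = len(A)
    x = list(x)
    for i in range(n):
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (h[i] - s) / A[i][i]
    return x

def quadratic(A, h, x):
    """x'Ax - 2h'x, equal to s^2 up to an additive constant."""
    n = len(A)
    xAx = sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))
    return xAx - 2.0 * sum(h[i] * x[i] for i in range(n))

A = [[3.17, 0.92, -1.07, 1.13],
     [0.92, 3.86, -0.89, -0.77],
     [-1.07, -0.89, 5.14, 1.79],
     [1.13, -0.77, 1.79, 6.23]]
h = [8.08, 6.32, 5.58, 11.05]

x = [0.0, 1.5, 1.5, 1.0]  # x_1 is arbitrary: the first operation overwrites it
values = [quadratic(A, h, x)]
for _ in range(10):
    x = seidel_sweep(A, h, x)
    values.append(quadratic(A, h, x))
print([round(v, 4) for v in x])
```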


4. The Acceleration of Convergence 

The successive vector increments are derived, as we have seen in § 2,
by the recurrence

$$x^{(t+1)} - x^{(t)} = B^{-1}C\{x^{(t)} - x^{(t-1)}\}. \tag{1}$$

We are therefore on the familiar and well-explored ground of the repeated
matrix operation on a vector, and we know that in the case that is being
considered

$$x^{(t+1)} - x^{(t)} = \lambda_1\{x^{(t)} - x^{(t-1)}\} \tag{2}$$

will hold with greater and greater approximation, and the more if
$|\lambda_2|/|\lambda_1|$ is small. In practice it is advantageous to form the vector-
difference at the earliest possible stage and to use it as operand, thus
dismissing h from consideration. We shall therefore use

$$B\,\Delta x^{(t)} = C\,\Delta x^{(t-1)}, \quad \text{where } \Delta x^{(t)} = x^{(t+1)} - x^{(t)}, \tag{3}$$

as the recurrence. When the iterated differences are adequately small,
we cumulate them upon $x^{(0)}$, and the errors of rounding-off likely to be
incurred in such a cumulation can be obviated by retaining one or two
additional digits, the customary expedient for "stabilizing" any calculation.

In this positive definite case, however, provided that $\lambda_1$ is real, it
will usually not be necessary to continue the iteration until the vector-
differences are as small as desired; it will be enough that the corresponding
elements in consecutive vector-differences should begin to show an approach
to a geometrical progression, which will be of common ratio $\lambda_1$. The
accelerative methods are then available. Suppose in fact that $\Delta x_i^{(t-1)}$ and
$\Delta x_i^{(t)}$ are two such corresponding elements; write
$\Delta^2 x_i^{(t-1)} = \Delta x_i^{(t)} - \Delta x_i^{(t-1)}$, and form the quotient

$$-\{\Delta x_i^{(t)}\}^2 / \Delta^2 x_i^{(t-1)}. \tag{4}$$

This (Aitken, 1925, p. 301; Holme, 1932; Steffensen, 1933) will
approximate to the remainder term; it should be applied, therefore, to each $x_i$
in the vector of solutions derived by cumulating up to $x^{(t)}$, and the resulting
vector can then be tested by further Seidelian iteration. A somewhat
less accurate remainder term is given by the quotient

(5) 

and this will serve sometimes as a check. 
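
The remainder term (4) is easily put into code. In the sketch below the function name `aitken_remainder` is hypothetical, and the differences used are those found for $x_1$ in the Seidelian example of § 6, in units of the fourth decimal place, with the third difference read as −151 (an assumption consistent with the divisor 99 quoted there).

```python
def aitken_remainder(d_prev, d_cur):
    """Remainder estimate -(d_cur)^2 / (d_cur - d_prev) from two consecutive differences."""
    return -d_cur * d_cur / (d_cur - d_prev)

# For an exact geometrical progression the estimate is the exact tail:
# differences 8, 4 (ratio 1/2) leave 2 + 1 + 1/2 + ... = 4 still to come.
print(aitken_remainder(8.0, 4.0))

# Differences for x_1 in section 6, units of the 4th decimal place.
diffs = [-928, -479, -151, -52]
correction = round(aitken_remainder(diffs[-2], diffs[-1]))
x1 = 2.2634 + (sum(diffs) + correction) * 1e-4
print(correction, round(x1, 4))
```

When consecutive differences are not yet nearly in geometrical progression the divisor may be small or of the wrong sign, and the estimate should then not be trusted.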

5. A Modification of Seidelian Iteration 

It has been established that in the symmetric and positive definite
case Seidelian iteration converges. If a type of Seidelian iteration could
be devised such that all the roots of the characteristic equation were real,
the accelerative methods would in every case be available, and would
enhance the already existing advantage of convergence. We suggest
therefore the following procedure.

As before, we have $A = D - C - C'$, positive definite. Let us begin
Seidelian iteration as usual, obtaining from an initial vector $x^{(0)}$ the
improved values of the unknowns, $x_1^{(1)}, x_2^{(1)}, \dots, x_n^{(1)}$. Then, instead of
beginning (as is the usual procedure) a new cycle $x_1^{(2)}, x_2^{(2)}, \dots$, let us go
back through the unknowns in reverse order, $x_n^{(1)} = x_n^{(2)}, x_{n-1}^{(2)}, \dots, x_1^{(2)} = x_1^{(3)}$;
then down again, $x_2^{(3)}, x_3^{(3)}, \dots, x_n^{(3)} = x_n^{(4)}$; then up again, and so on.
This is a to-and-fro or two-phase Seidelian iteration, and the results are
obtained in the order indicated below: e.g.



        (down)               (up)                  (down)

        x_1^(1)              x_1^(2) = x_1^(3)
           ↓                    ↑                     ↓
        x_2^(1)              x_2^(2)               x_2^(3)
           ↓                    ↑                     ↓
        x_3^(1)              x_3^(2)               x_3^(3)
           ↓                    ↑                     ↓
        x_4^(1) = x_4^(2)                          x_4^(3) = x_4^(4)        (1)


The two stages, down and up, are to be regarded as the complementary
halves of a double Seidelian operation, rather like the two half-oscillations
of a complete oscillation; so that it will be useful to speak of elements
$x_i^{(t)}, x_i^{(t+2)}, x_i^{(t+4)}, \dots$ as being "in phase". Forming therefore the
differences of elements in phase, let us say

$$\delta x_i^{(t)} = x_i^{(t+2)} - x_i^{(t)}, \tag{2}$$

we can regard the corresponding vector-differences as operands, and the
complete operation is then characterized by the matrix

$$(D - C')^{-1}C \cdot (D - C)^{-1}C'. \tag{3}$$

It will now be shown that every latent root of this matrix is real and
such that $0 \le \lambda < 1$. For in the first place $|\lambda| < 1$, since each individual
operation is of Seidelian type. Next, we have

$$(D - C')^{-1}C \cdot (D - C)^{-1}C' = D^{-1/2}\{(I - K')^{-1}K(I - K)^{-1}K'\}D^{1/2}, \tag{4}$$

where $K = D^{-1/2}CD^{-1/2}$. This again, since K and $(I - K)^{-1}$ are permutable,
is similar to $(I - K)^{-1}KK'(I - K')^{-1}$, namely a non-negative definite
matrix of form $M'M$ and in general of rank n − 1, since K is in general
of rank n − 1. We conclude that $0 \le \lambda < 1$. The effective positiveness
of the roots confers some arithmetical advantage.

It may be of interest to derive the above result from first principles.
To each of the latent roots $\lambda$ there corresponds a non-trivial characteristic
vector q such that

$$KK'q = \lambda(I - K)(I - K')q. \tag{5}$$

Hence $q'KK'q = \lambda\, q'(I - K)(I - K')q$,

that is, $$\lambda = q'KK'q / \{q'(I - K - K')q + q'KK'q\}. \tag{6}$$

The quadratic form $q'(I - K - K')q$ is positive definite, by hypothesis,
and $q'KK'q$ is non-negative definite. It follows that $0 \le \lambda < 1$.

In the practical technique we begin with $x^{(0)}$, form $x^{(1)}$ with the
downward operations and then $x^{(2)}$ with the upward ones. Constructing then
the vector-difference $x^{(2)} - x^{(0)}$, we take it as operand for the further
downward and upward operations. When convergence seems adequate,
we can cumulate in two ways, first upon $x^{(0)}$ with all vector-differences
that are in phase with it, and again upon $x^{(1)}$ with all vector-differences
in phase with it; these two vectors of iterated solutions can be used to
check each other.
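
The to-and-fro procedure may be sketched as follows. For simplicity the sketch iterates on the unknowns themselves rather than on the vector-differences, which gives the same sequence of vectors; the function names are hypothetical and the system is that of § 6 under the assumed readings $a_{12} = 0.92$, $a_{44} = 6.23$, $h_2 = 6.32$.

```python
def sweep(A, h, x, order):
    """Annul the residuals of the equations listed in `order`, one at a time."""
    n = len(A)
    x = list(x)
    for i in order:
        s = sum(A[i][j] * x[j] for j in range(n) if j != i)
        x[i] = (h[i] - s) / A[i][i]
    return x

def to_and_fro(A, h, x0, cycles):
    """Two-phase Seidelian iteration: down through the unknowns, then back up."""
    n = len(A)
    x = list(x0)
    history = []
    for _ in range(cycles):
        x = sweep(A, h, x, range(n))            # downward phase
        history.append(list(x))
        # Upward phase. Recomputing the turning-point unknown x_n first merely
        # reproduces its value, in agreement with x_n^(1) = x_n^(2).
        x = sweep(A, h, x, reversed(range(n)))
        history.append(list(x))
    return x, history

A = [[3.17, 0.92, -1.07, 1.13],
     [0.92, 3.86, -0.89, -0.77],
     [-1.07, -0.89, 5.14, 1.79],
     [1.13, -0.77, 1.79, 6.23]]
h = [8.08, 6.32, 5.58, 11.05]

x, history = to_and_fro(A, h, [0.0, 1.5, 1.5, 1.0], 20)
print([round(v, 4) for v in history[0]], [round(v, 4) for v in history[1]])
```

Entries `history[0]`, `history[2]`, ... are in phase with one another, as are `history[1]`, `history[3]`, ...; their differences are the operands spoken of in the text.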

6. Numerical Examples of the Various Iterations

To solve

$$\begin{bmatrix} 3.17 & 0.92 & -1.07 & 1.13 \\ 0.92 & 3.86 & -0.89 & -0.77 \\ -1.07 & -0.89 & 5.14 & 1.79 \\ 1.13 & -0.77 & 1.79 & 6.23 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 8.08 \\ 6.32 \\ 5.58 \\ 11.05 \end{bmatrix},$$

beginning with values 1.5, 1.5, 1.0 as first approximations for $x_2, x_3, x_4$.
The usual Seidelian iteration gives the first iterated vector {2.2634 1.6432
1.4931 1.1372}. Forming the first vector-difference, we continue:

                                                              Check
              -928    -479    -151     -52     -27   2.0997    8.0796
     1432      479      58      15       4       1   1.6989    6.3200
      -69     -588    -228     -85     -29     -15   1.3986    5.5797
     1372      396     160      54      18       9   1.2009   11.0496


Stopping at the stage where each entry seems to be about one-third
of the corresponding entry in the column before, we apply the corrections
of § 4 (4), for example $-52^2/99 = -27$, and cumulate upon the earliest
values 2.2634, 1.5, 1.5 and 1.0. The check suggests that the solutions
have at most small errors in the 4th decimal place. The accurate values
are in fact {2.0999 1.6989 1.3987 1.2009}.

The new variant of the Seidelian iteration gives the following opening 
values, down and then up, after which we continue with vector-differences 
in phase: 



             2.2634             -1132            -347            -108    -49   2.0998
     1.5     1.6432   1.6579     1579     417     285      98      86     37   1.6987
     1.5     1.4931   1.4453     -547    -643    -317    -209    -102    -48   1.3986
     1.0     1.1372              1372     442             135             59   1.2008


For example, 1.5 + 0.1579 + 0.0285 + 0.0086 + 0.0037 = 1.6987.

Here there is little to choose between the old and the new Seidelian 
iteration, for it so happens that the largest characteristic root of the former 
is real and fairly small. 

Simple iteration, on the other hand, when performed with the change
of matrix origin mentioned in § 3, so that for example the first equation
of iteration is

$$6.34\,x_1^{(1)} = 3.17\,x_1^{(0)} - 0.92\,x_2^{(0)} + 1.07\,x_3^{(0)} - 1.13\,x_4^{(0)} + 8.08,$$

converges, but so much more slowly than the Seidelian iterations that it
seems unnecessary to give the details. In particular, if we begin with
the initial values {2.0 1.5 1.5 1.0}, the successive increments for $x_2$
take a long time in settling down to an approximate geometrical
progression, of common ratio about 0.75.





REFERENCES TO LITERATURE 

Aitken, A. C., 1925. "On Bernoulli's Numerical Solution of Algebraic
Equations", Proc. Roy. Soc. Edin., xlvi, 289-305.

Aitken, A. C., 1937. "The Evaluation of the Latent Roots and Latent Vectors of a
Matrix", Proc. Roy. Soc. Edin., lvii, 269-304.

Bodewig, E., 1947. "Bericht über die verschiedenen Methoden zur Lösung eines
Systems linearer Gleichungen mit reellen Koeffizienten", Indag. Math., ix,
441-452, 1104-1116, 1285-1295; x, 53-64, 211-219.

Frazer, R. A., Duncan, W. J., and Collar, A. R., 1938. Elementary Matrices,
Cambridge University Press.

Steffensen, J. F., 1933. "Remarks on Iteration", Skandinavisk
Aktuarietidskr., 64-72.

Turnbull, H. W., and Aitken, A. C., 1932. Theory of Canonical Matrices,
Blackie & Son, London and Glasgow.

Whittaker, E. T., and Robinson, G., 1924. The Calculus of Observations,
Blackie & Son, London and Glasgow.


(Issued separately June 16, 1950)





V. Les transformations asymptotiquement presque périodiques
discontinues et le lemme ergodique. (Première Note.)
By Maurice Fréchet, Hon.F.R.S.E., Université de Paris,
à la Sorbonne. Communicated by Sir Edmund WHITTAKER,
F.R.S.

(MS. received October 4, 1948. Revised MS. received April 8, 1949. Read May 2, 1949) 


Synopsis 

With the aim of establishing, under wide conditions, the ergodic theorem of G. D. 
Birkhoff, the author extends the class of asymptotically almost-periodic functions, con¬ 
sidering now not only continuous functions, as he had already done in 1943, but dis¬ 
continuous functions. Definitions and properties of the extended class of functions are 
set out, some comparisons being made with almost-periodic functions in the sense of 
Bohr, Stepanoff, Weyl and Besicovitch. Applications to the ergodic theorem are 
adumbrated. 


Introduction 

We have shown previously * how the introduction of A.P.P.C. functions
(asymptotically almost-periodic continuous functions) considerably
simplifies, by a method applicable to many other questions, the proof of
George Birkhoff's ergodic lemma. By showing that the function $f(T_t M)$
occurring there belongs to the class of A.P.P.C. functions, which is much
more restricted than the class of functions having a mean at infinity, our
method yields in addition a result much more precise than the lemma
itself. Finally, it extends the validity of the lemma in one direction, by
removing several of the hypotheses on which the lemma was founded. It is
true that at the same time our method restricted its validity in a second
direction, by introducing certain hypotheses of continuity not required by
the lemma.

* "Les fonctions asymptotiquement presque périodiques et leur application au
problème ergodique", Revue Scientifique, 79ème Année, 1941, pp. 341-354 and 407-417.

Even had we stopped there, our method yielded new results. And even had
it supplied only another form of proof, it would have retained some
interest.

But we had already indicated at that time that our hypotheses of
continuity did not play an essential part, and that they merely simplified
the preparation for the proof, a preparation consisting in the study of the
properties, useful in themselves, of the A.P.P.C. functions. We had even
thought it possible (pages 416-417 of our Memoir) to extend our proof to
the discontinuous case. But our extension rested on a statement of W. H.
Young which we had reproduced inexactly from memory, as M. Ky Fan has
pointed out to us. The extension could therefore not be regarded as proved,
as we observed in a later memoir.*

We shall therefore take up the question again and show how the A.P.P.C.
functions can be generalized, and how asymptotically almost-periodic but
discontinuous functions can be used to extend the validity of our proof.
For this it suffices to apply to such a generalization the spirit of the
indications already given on pages 343 and 344 of our first Memoir cited
above.

We shall begin by placing ourselves in a very general case, which we shall
afterwards particularize when that becomes necessary or at least more
convenient.

Asymptotically Almost-Periodic Transformations.
Convergence in the Sense $L_\gamma$

Consider a family $\mathcal{F}$ of transformations $N = \Phi(t)$ of a real number t
into an element N of a set E of elements of any nature whatever, each of
these transformations being necessarily defined only on a "positive
half-line" $t > a$,† which may vary with the transformation. Suppose further
that, regarding E as the support of a function space $\mathcal{F}$ of which each $\Phi(t)$
is a point, we treat $\mathcal{F}$ as an "L space", that is to say a space in which the
limit of a sequence of such "points" has been defined. More precisely, with
each positive half-line $t > \gamma$ we associate a definition $L_\gamma$ of the limit, on
this half-line, of a sequence of elements $\Phi_n(t)$ of $\mathcal{F}$ to an element $\Phi(t)$ of
$\mathcal{F}$. (We suppose: 1° that if, on $t > \gamma$, $\Phi_n$ tends to $\Phi$ in the sense $L_\gamma$, the
same holds for every sequence extracted from the sequence of the $\Phi_n$; and
2° that if the functions $\psi_n(t)$ are identical on $t > \gamma$ with a function $\psi$ of
$\mathcal{F}$, then they converge to $\psi$ in the sense $L_\gamma$.) We suppose, moreover, the
$L_\gamma$ to be such that if $\gamma' > \gamma$, every sequence $\Phi_n(t)$ converging to $\Phi(t)$ on
$t > \gamma$ in the sense $L_\gamma$ converges also to $\Phi(t)$ on $t > \gamma'$ in the sense
$L_{\gamma'}$. For example, taking for E the set of real numbers, one could take for
$L_\gamma$ convergence almost everywhere on $t > \gamma$, so that it is not necessary to
suppose $\Phi(t)$ continuous.

* "Sur le problème ergodique", Revue Scientifique, 81e Année, 1943, pp. 115-157.

† See the first article cited above for the reasons which lead us to consider here
a half-line instead of a whole line and, a little further on, a certain limit when
a number K tends to $+\infty$ instead of $\pm\infty$.



Transformations Asymptotically Almost-Periodic
on the Right in the Sense L

This being so, we shall say that a transformation $F(t)$ of $\mathcal{F}$ is
asymptotically almost-periodic on the right * in the sense L if, whatever
the sequence $\sigma$ of numbers $h_n$ tending to $+\infty$, there exist a sequence of
numbers $k_n$ extracted from $\sigma$ and a function $p(t)$ of $\mathcal{F}$ such that, whatever
the number $\gamma$, $F(t + k_n)$ converges to $p(t)$ in the sense $L_\gamma$ on $t > \gamma$. (This
implicitly supposes that $p(t)$ is defined for every number t.)

Equivalent definition. It is moreover possible to put this definition into
an equivalent form (though in appearance less strict), and above all one
more convenient for certain proofs. In order that a transformation
$N = F(t)$ of $\mathcal{F}$ be asymptotically almost-periodic in the sense L, it is
necessary and sufficient that, whatever the number a and whatever the
sequence $\sigma$ of the $h_n \to +\infty$, there exist a sequence of numbers $l_n$
extracted from $\sigma$ such that the sequence of the $F(t + l_n)$ is convergent in
the sense $L_a$ on $t > a$.

The condition is evidently necessary. To prove that it is sufficient, let
us use the diagonal process. If the condition is fulfilled, apply it for
$a = -1$; put $l_n = l_n^{(1)}$ and let $p^{(-1)}(t)$ be the limit in the sense $L_{-1}$
of the $F(t + l_n^{(1)})$. Apply the condition again for $a = -2$, but replacing
the $h_n$ by the $l_n^{(1)}$; we extract a sequence of numbers $l_n^{(2)}$, and the
sequence of the $F(t + l_n^{(2)})$ will converge in the sense $L_{-2}$ to a
transformation $p^{(-2)}(t)$. By what precedes, $p^{(-1)}(t)$ coincides with
$p^{(-2)}(t)$ for $t > -1$. We continue in this way and form sequences of
numbers $l_n^{(r)}$ extracted from the sequence of the $l_n^{(r-1)}$ and such that
the $F(t + l_n^{(r)})$ converge in the sense $L_{-r}$ on $t > -r$ to a transformation
$p^{(-r)}(t)$. Now let $k_n = l_n^{(n)}$; the sequence of the $k_n$ is extracted from the
sequence of the $h_n$. And since, from the rank r onwards, it is also
extracted from the sequence of the $l_n^{(r)}$, the sequence $F(t + k_n)$ converges
in the sense $L_{-r}$ to $p^{(-r)}(t)$ for $t > -r$. Further, let $p(t)$ denote the
function which coincides with $p^{(-r)}(t)$ for $-r < t \le -r + 1$ and with
$p^{(-1)}(t)$ for $t > -1$; it then coincides also with $p^{(-r)}(t)$ for every value
of $t > -r$. Now let a be any number; there is at least one positive integer
r such that $-r < a$; then $F(t + k_n)$, converging in the sense $L_{-r}$ to
$p^{(-r)}(t)$ on $t > -r$, converges in the same sense to $p(t)$ on $t > -r$ and
consequently converges in the sense $L_a$ to $p(t)$ on $t > a$. The condition
assumed is therefore indeed sufficient.

Remark. The definition we have just adopted is the most convenient for the
present case. But, from the intuitive point of view, it has the drawback of
showing nothing that justifies the expression "almost periodic". By
particularizing what precedes to the case where the limit in the sense L can be
expressed by means of a distance, one can, as we shall do elsewhere, prove the
existence of "almost-periods". And the hypothesis of a distance will even allow
properties to be added to those established here.

* For brevity, we shall henceforth omit the words "on the right".


Generalization. Let $\psi(t) = U[\Phi(t)]$ be a functional operation making
correspond to transformations $N = \Phi(t)$ of the family $\mathcal{F}$ (in which the
limit in the sense L has been defined) transformations $P = \psi(t)$ belonging
to a family $\mathcal{F}'$ (in which the limit has been defined in a certain sense
$L'$), P belonging to a set $E'$ of elements of any nature whatever ($E'$ being
distinct from E or not). We shall suppose further that, for every positive
constant h, we have $U[\Phi(t + h)] = \psi(t + h)$.*

Condition D. The operation U will be said to be continuous if, when
$\Phi_n(t)$ of $\mathcal{F}$ tends in the sense L to $\Phi(t)$ of $\mathcal{F}$, $U[\Phi_n(t)]$ tends in the
sense $L'$ to $U[\Phi(t)]$ of $\mathcal{F}'$. When this is so, the operation U transforms
every transformation $\Phi(t)$ which is asymptotically almost-periodic in the
sense L into a transformation $\psi(t)$ which is asymptotically almost-periodic
in the sense $L'$. For if $h_n \to +\infty$, we can draw from it a sequence
$k_1, k_2, \dots$ such that $\Phi(t + k_n)$ converges in the sense L to a function
$p(t)$ of $\mathcal{F}$, and then $\psi(t + k_n) = U[\Phi(t + k_n)]$ converges in the sense
$L'$ to a function $U[p(t)]$ of $\mathcal{F}'$.

In particular, let $P = f(N)$ be a transformation of points N of E into
points P of $E'$; $f(\Phi(t))$ is a transformation of numbers t into points of
$E'$, and consequently $f(\Phi(t))$ defines an operation $U[\Phi(t)]$ of a
particular kind. It may be continuous in the sense indicated above, and
then if $\Phi(t)$ is asymptotically almost-periodic in the sense L, $f(\Phi(t))$
will be so too in the sense $L'$.

Applications to Families of Point Transformations

Let S be, to begin with, a bounded and closed set of points of a Euclidean
space or, more generally, a set compact in itself † formed of points of a
topological space E in which a limit is defined satisfying the conditions
1°, 2° of p. 62. Consider now a transformation $N = T_t(M)$ of an arbitrary
point M of S into a corresponding point N of S, this transformation
depending on the numerical parameter t, supposed essentially $> 0$. We shall
make the following hypotheses:

(A) For each fixed M of S, $T_t(M)$ belongs to the family $\mathcal{F}$.

(B) We can then also regard $T_t(M)$ as a point of the function space $\mathcal{F}$, a
point depending on the point M. We suppose that it is a continuous
transformation, in the sense L, with respect to the point M of S. That is
to say, if $M_n$ of S tends to M of S, then $T_t(M_n)$, regarded as a function
of t, converges to $T_t(M)$ in the sense $L_0$ on $t > 0$. We shall suppose
further that, whatever h may be, $T_{t+h}(M_n)$ converges to $T_{t+h}(M)$ in the
sense $L_{-h}$.

(C) We have

$$T_{t+t'}(M) = T_t(T_{t'}(M))$$

for every pair $t > 0$, $t' > 0$.

* This is, for example, what happens when the transformations $\Phi(t)$ are numerical
functions, each absolutely integrable over its domain of definition, and we take
$U[\Phi(t)] = \int \Phi(z)\,dz$. It is also what happens in the example considered below,
where $U[\Phi(t)] = f(\Phi(t))$.

† A set S is compact in itself when every sequence of points $M_n$ of S contains
a sub-sequence of points $M_{n_1}, M_{n_2}, \dots$ which converges to a point of S.

Under these three hypotheses, $T_t(M)$ is a transformation asymptotically
almost-periodic in the sense L. For let $h_1, h_2, \dots$ be a sequence of
numbers tending to $+\infty$, and a any number. We can find r such that
$h_r + a > 0$ and $h_r > 0$, and find $\nu$ such that $h_n - h_r > 0$ for $n > \nu$.
We can then write, for $n > \nu$ and $t > a$,

$$T_{t+h_n}(M) = T_{t+h_r}(T_{h_n - h_r}(M)).$$

The sequence of points $T_{h_n - h_r}(M)$, belonging to S, contains a convergent
sub-sequence of points $T_{h_{n_j} - h_r}(M)$; let $M'$ be its limit, which belongs
to S. For $t > a$, $t + h_r$ will be $> 0$, and the sequence of transformations
$T_{t+h_{n_j}}(M)$ will converge in the sense $L_a$ to the function $T_{t+h_r}(M')$.

Let us now return to the operation U considered above, which makes
correspond to a transformation $N = F(t)$ of $\mathcal{F}$ a transformation $\psi(t)$ of
$\mathcal{F}'$ and which again satisfies the condition

(K) $U[F(t + h)] = \psi(t + h)$.

If this operation is continuous in the sense indicated above, then
$U[T_t M]$ is a transformation belonging to $\mathcal{F}'$ which, under the hypotheses
(A), (B), (C), (K) on $T_t M$, will also be a transformation asymptotically
almost-periodic in the sense $L'$.

Mean at infinity. It is known that the almost-periodic functions in the
sense of Bohr (which are continuous), or in the senses of Stepanoff, of H.
Weyl or of Besicovitch * (which may be discontinuous), have a mean at
infinity.† The same is therefore true of the corresponding asymptotically
almost-periodic functions. On the other hand, of the four corresponding
definitions, the first three are particular cases ‡ of ours above. We shall
make a supplementary hypothesis, which can therefore certainly be
satisfied (and which indeed is satisfied in three cases that are among the
simplest and most important), by placing ourselves now in the case where:

(H) The definition of the limit $L'$ in the family $\mathcal{F}'$ is such that every
transformation asymptotically almost-periodic in the sense $L'$, say $\psi(t)$,
has a mean at infinity.

* For their definitions see Favard, Leçons sur les fonctions presque périodiques,
Gauthier-Villars, 1933.

† Definition below, p. 66.

‡ See A. S. Kovanko, "Sur la compacité des systèmes de fonctions presque périodiques
généralisées de H. Weyl", C.R. Acad. Sc. U.R.S.S., xliii, 1944, 275-276.

This supposes first of all that the definition of the mean at infinity can
have a meaning in the case where E and $E'$ are of any nature whatever.

We shall say that $\psi(t)$ of $\mathcal{F}'$ has a mean at infinity (on the right) when
there exists at least one number $\beta$ (on the half-line $t > a$ where $\psi$ is
defined) such that the ratio

$$\frac{1}{K}\int_{\beta}^{\beta+K} \psi(t)\,dt \tag{1}$$

tends to a determinate limit when $K \to +\infty$. Since here $N = \psi(t)$ is a point
of the set $E'$, whose elements we have hitherto supposed to be of any nature
whatever, a meaning must first be given to the integral and to the limit.
One can be given when $E'$ is a normed vector space ("espace vectoriel
distancié"), also called a Banach-Wiener space.

For such a space the generalization of the classical integral in the sense
of Cauchy is immediate, as the unique limit, if it exists, of sums of the
classical form

$$\sum_{i=1}^{n} \psi(t_i)(t_i - t_{i-1}).$$

But it is more useful to generalize the Stieltjes-Lebesgue integral, by
applying to the present case the still more general definition which we
have given elsewhere.* It will suffice to retain that, when this integral
exists,

(1°) it is the unique limit, when $\epsilon \to 0$, of sums of the form

$$s = \sum_i \psi(\xi_i)\,(\text{measure of } e_i),$$

where $(\beta, \beta + K)$ has been decomposed into a finite or enumerable
sequence of disjoint measurable sets $e_i$, in each of which the
"oscillation" of $\psi(t)$ is $< \epsilon$, where $\xi_i$ is an arbitrary point of $e_i$,
and where s is supposed absolutely convergent;

(2°) this integral possesses those of the classical properties of the
ordinary integral which we shall have to use in what follows.

* "L'intégrale abstraite d'une fonction abstraite d'une variable abstraite et son
application à la moyenne d'un élément aléatoire de nature quelconque", Revue Scientifique,
82e Année, 1944, pp. 483-512.


Cartesian case. In the case where $E'$ is a Cartesian space of a finite
number, r, of dimensions, every point of $E'$ can be defined by an ordered
sequence of a fixed finite number, r, of numbers called the co-ordinates of
the point. We then see that the ratio (1) will represent the point of $E'$
whose co-ordinates are

$$\frac{1}{K}\int_{\beta}^{\beta+K} \psi_k(t)\,dt \quad (k = 1, \dots, r), \tag{2}$$

where $\psi_k(t)$ is the k-th co-ordinate of $\psi(t)$.*

We see that $\psi(t)$ has a mean at infinity if the ratios (2) tend to
respective limits when $K \to +\infty$, limits which we shall denote by
$\mathfrak{M}_{+\infty}\,\psi_k(t)$. Thus the mean at infinity of the moving point $\psi(t)$ of $E'$
will be the point of $E'$, denoted by $\mathfrak{M}_{+\infty}\,\psi(t)$, which has for co-ordinates
the $\mathfrak{M}_{+\infty}\,\psi_k(t)$, and we may write

$$\mathfrak{M}_{+\infty}\,\psi(t) = \lim_{K \to +\infty} \frac{1}{K}\int_{\beta}^{\beta+K} \psi(t)\,dt. \tag{3}$$

Case of the Banach-Wiener space. The same notation will be employed when
$E'$ is a normed vector space. It is not necessary to introduce into the
symbol on the left-hand side the quantity $\beta$ which appears on the
right-hand side. For if the limit (3) exists for one value of $\beta$ (such that
$\psi(t)$ is defined at least for $t > \beta$), it exists also (and has the same
value) for every other value $\gamma$ ($> a$) of $\beta$. We can in fact write

$$\frac{1}{K}\int_{\gamma}^{\gamma+K} \psi(t)\,dt = \frac{\gamma - \beta + K}{K}\cdot\frac{1}{\gamma - \beta + K}\int_{\beta}^{\gamma+K} \psi(t)\,dt - \frac{1}{K}\int_{\beta}^{\gamma} \psi(t)\,dt,$$

and the right-hand side evidently tends to the same point of $E'$ as the
right-hand side of (3).
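
The independence of the mean at infinity from the lower limit $\beta$ can be illustrated numerically. The sketch below is an illustration only: the function $\psi(t) = 2 + \cos t + e^{-t}$ is a hypothetical example (an almost-periodic part plus a term vanishing at infinity), chosen because its integral is known in closed form, and the name `mean_over_window` is likewise an assumption.

```python
import math

def mean_over_window(beta, K):
    """(1/K) * integral of psi over [beta, beta + K], psi(t) = 2 + cos t + exp(-t)."""
    integral = (2.0 * K
                + (math.sin(beta + K) - math.sin(beta))
                + (math.exp(-beta) - math.exp(-(beta + K))))
    return integral / K

# The ratio (1) for increasing K and two different lower limits beta.
for K in (1e2, 1e4, 1e6):
    print(K, mean_over_window(0.0, K), mean_over_window(5.0, K))
```

Both columns approach the same value 2, the bounded terms of the integral being divided by K.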

Remark. $\mathfrak{M}_{+\infty}\,\psi(t)$ is often called the mean of $\psi(t)$. We prefer to call it the
mean at infinity (on the right), in order to emphasize the fact, which follows from
what precedes, that the existence and the value of this mean are not altered when
$\psi(t)$ is replaced on any finite segment by another transformation, likewise
arbitrary, provided that it is, like $\psi(t)$, integrable on this segment. In other
words, the existence and the value of $\mathfrak{M}_{+\infty}\,\psi(t)$ depend, if one may so express
it, only on the values of $\psi(t)$ at infinity (on the right).

Application to the ergodic lemma. All this being so, it follows from what
precedes that if the definitions of the limit in the senses L and $L'$, that
of the transformation $T_t M$ and that of the operation U satisfy the
conditions (A), (B), (C), (K), (H), then $U[T_t M]$ has a mean at infinity.
But we have, in what precedes, proved more than this property (which
constitutes the result of the ergodic lemma), since we have even proved
that $U[T_t M]$ is a function of t which is asymptotically almost-periodic
in the sense $L'$.

* This language implicitly supposes that $\psi(t)$ is integrable on every finite segment
of $t > a$ (that is to say, that this is so for its co-ordinates).

Remark. In a later memoir we shall indicate the supplementary properties which are
obtained in the case where the limits L, $L'$ can be defined by means of a distance,
and finally we shall apply these results to the special case considered by George
Birkhoff.


(Issued separately June 16, 1950)





VI. Unbiased Statistics with Minimum Variance. By A.
Bhattacharyya, Statistical Laboratory, Calcutta. Communicated
by Professor A. C. AITKEN, F.R.S.

(MS. received April 26, 1949. Read July 4, 1949) 


Synopsis 

The problem considered is that of the estimation of a statistical parameter from a 
sample of values of the variate or variates concerned. Reference is made to the method 
of unbiased statistics with minimum variance, developed by Aitken and Silverstone. 
The principal result obtained by these authors is generalized, and an inequality involving 
the variances of unbiased statistics is obtained. Several examples illustrating the theory 
are appended.


1. The problem of estimation has been approached from various angles.
Of these, perhaps the earliest was that of "unbiased statistics with minimum
variance". The method of least squares of Gauss and that of linear
estimation of Markoff were based on this concept. Fisher (1921) laid the
foundation of his theory of estimation on this concept; his criterions of
consistency and efficiency are nothing but the large-sample set-up of
unbiasedness and minimum variability. Aitken revived the problem
again and, in collaboration with Silverstone (1941), made an attempt to
arrive at some general solution. The last-named author showed, with
the help of the Calculus of Variations, that under some regularity
conditions and in the case when the ranges of the stochastic variables are
independent of the parameters, a statistic $t_j(x_1, x_2, \dots, x_n)$ is an unbiased
estimate with minimum variance of the parameter $\theta_j$ if

$$t_j - \theta_j = \lambda_j \frac{\partial \log F}{\partial \theta_j},$$

where $F(x_1, x_2, \dots, x_n;\ \theta_1, \theta_2, \dots, \theta_l)\,dx_1\,dx_2 \dots dx_n$ is the joint
probability distribution of the stochastic variables $x_1, x_2, \dots, x_n$,
$t_j(x_1, x_2, \dots, x_n)$ is a function of these stochastic variables alone (free
from the parameters $\theta_1, \theta_2, \dots, \theta_l$), and $\lambda_j(\theta_1, \theta_2, \dots, \theta_l)$ is a function
of the parameters $\theta_1, \theta_2, \dots, \theta_l$ only. The author in a series of papers
(1946-48) obtained a more general result, though the method followed
by him in proving this was different from that of Aitken and Silverstone.
It was shown that, under some general conditions similar to those stated
above, the statistic $t(x_1, x_2, \dots, x_n)$ is an unbiased estimate with
minimum variance of the function $\tau(\theta_1, \theta_2, \dots, \theta_l)$ of the parameters if

$$t - \tau = \sum_j \lambda_j \frac{\partial \log F}{\partial \theta_j}, \tag{1.1}$$



where the summation is over the j's, and the $\lambda$'s are functions of the
parameters $\theta$'s. It was further shown that in case a statistic satisfying
(1.1) did not exist, even then it was possible to derive a lower bound of
the variances of unbiased statistics estimating $\tau$. It will be readily seen
that the result of Aitken and Silverstone is a special case of this general
result. We now propose to generalize the result (1.1) still further and
obtain a more extended result. In deriving this we shall follow the method
of the Calculus of Variations, though the method already used by the
author in the previous papers (namely, the method of regression) is also
available and in some respects more suitable. Thus, it should be
remarked that a result deduced by the method of the Calculus of
Variations is only valid when the (stochastic) variables are continuous, whereas
the result derived by us is valid both for continuous and discrete cases
(as can be shown by deriving this by the regression method).

2. Let the $n$ stochastic variables $x_1, x_2, \ldots, x_n$ (independent or dependent) follow the probability distribution

$$F(x_1, x_2, \ldots, x_n;\ \theta_1, \theta_2, \ldots, \theta_l)\,dx_1\,dx_2 \ldots dx_n, \tag{2.1}$$

where $\theta_1, \theta_2, \ldots, \theta_l$ are $l$ parameters involved in the distribution. Let $\tau(\theta_1, \theta_2, \ldots, \theta_l)$ be a given function of the parameters. The problem is to find a function of the stochastic variables $t(x_1, x_2, \ldots, x_n)$ such that its expectation is the given function $\tau$ and its variance is a minimum. Analytically we have to find the function $t$ such that

$$\int (t - \tau)^2 F\,dv = \text{a minimum},$$

subject to the condition

$$\int t\,F\,dv = \tau, \tag{2.2}$$

where for brevity $dv$ has been written for the elementary volume $dx_1\,dx_2 \ldots dx_n$, and the integral extends over all the variables over their entire range. As $\tau$ is supposed to be given, our purpose is served if we can find a $t$ such that

$$\int t^2 F\,dv = \text{a minimum}, \tag{2.3}$$

subject to the same condition (2.2). The problem as stated in the general





form has not been solved yet, and there may not be a general solution at all for all distributions. It may often happen that no unbiased statistic satisfying (2.2) exists, and in that case it is futile and irrelevant to proceed in this way. Or it may happen that in a particular distribution only some classes of functions of the parameters have unbiased estimates. Even then the following method may not lead to a statistic with minimum variance. In short, we do not claim to have found a necessary condition under which an unbiased statistic with minimum variance exists; what we have found is a sufficient condition which, if satisfied, leads to the determination of an unbiased statistic with minimum variance (provided it exists).

To get a practical solution we assume that the limits of integration in (2.2) or (2.3) are independent of the parameters $\theta_1, \theta_2, \ldots, \theta_l$. This assumption regarding the independence of the range of the stochastic variables of the parameters severely limits the applicability of this method; but this restriction was assumed by Aitken and Silverstone also. Then, under some regularity conditions (e.g. differentiability of the frequency function $F$ with respect to the parameters $\theta$'s, uniform convergence of the integrals involving such functions, etc.), we can derive other relations from (2.2) with the help of integration and differentiation (including repeated process) with respect to the parameters $\theta_1, \theta_2, \ldots, \theta_l$. For example, we can deduce the following relations:


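(i) $\displaystyle \int t\,\frac{\partial^{\,i_1+i_2+\cdots+i_l}F}{\partial\theta_1^{i_1}\,\partial\theta_2^{i_2}\cdots\partial\theta_l^{i_l}}\,dv = \frac{\partial^{\,i_1+i_2+\cdots+i_l}\tau}{\partial\theta_1^{i_1}\,\partial\theta_2^{i_2}\cdots\partial\theta_l^{i_l}}$;

(ii) $\displaystyle \int t\left[\int_{\theta_i^0}^{\theta_i} h(\theta_1, \theta_2, \ldots, \theta_l)\,F\,d\theta_i\right]dv = \int_{\theta_i^0}^{\theta_i} h(\theta_1, \theta_2, \ldots, \theta_l)\,\tau\,d\theta_i$;

(iii) $\displaystyle \int t\left[\int_{\theta_i^0}^{\theta_i}\!\int_{\theta_j^0}^{\theta_j} h(\theta_1, \theta_2, \ldots, \theta_l)\,F\,d\theta_i\,d\theta_j\right]dv = \int_{\theta_i^0}^{\theta_i}\!\int_{\theta_j^0}^{\theta_j} h(\theta_1, \theta_2, \ldots, \theta_l)\,\tau\,d\theta_i\,d\theta_j$;

where $\theta_i^0$, $\theta_j^0$ are suitable constants, and $h(\theta_1, \theta_2, \ldots, \theta_l)$ is some suitable function of the parameters $\theta_1, \theta_2, \ldots, \theta_l$.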

The above are given by way of illustration, and the reader can easily construct more or less complicated examples. But these or similar expressions are generally of the form

$$\int t\,\Phi_i(x_1, x_2, \ldots, x_n;\ \theta_1, \theta_2, \ldots, \theta_l)\,dv = a_i(\theta_1, \theta_2, \ldots, \theta_l), \tag{2.4}$$

where the $\Phi$'s are some functions of the variables $x_1, x_2, \ldots, x_n$ and the parameters $\theta_1, \theta_2, \ldots, \theta_l$, which can be derived from $F(x_1, x_2, \ldots, x_n;\ \theta_1, \theta_2, \ldots, \theta_l)$, and the $a$'s are some functions derivable from



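$\tau(\theta_1, \theta_2, \ldots, \theta_l)$ without any prior knowledge of the actual statistic $t$.

Thus in the above cases we have

(i) $\displaystyle \Phi = \frac{\partial^{\,i_1+i_2+\cdots+i_l}F}{\partial\theta_1^{i_1}\,\partial\theta_2^{i_2}\cdots\partial\theta_l^{i_l}}, \qquad a = \frac{\partial^{\,i_1+i_2+\cdots+i_l}\tau}{\partial\theta_1^{i_1}\,\partial\theta_2^{i_2}\cdots\partial\theta_l^{i_l}}$;

(ii) $\displaystyle \Phi = \int_{\theta_i^0}^{\theta_i} h(\theta_1, \theta_2, \ldots, \theta_l)\,F\,d\theta_i, \qquad a = \int_{\theta_i^0}^{\theta_i} h(\theta_1, \theta_2, \ldots, \theta_l)\,\tau\,d\theta_i$;

(iii) $\displaystyle \Phi = \int_{\theta_i^0}^{\theta_i}\!\int_{\theta_j^0}^{\theta_j} h(\theta_1, \theta_2, \ldots, \theta_l)\,F\,d\theta_i\,d\theta_j, \qquad a = \int_{\theta_i^0}^{\theta_i}\!\int_{\theta_j^0}^{\theta_j} h(\theta_1, \theta_2, \ldots, \theta_l)\,\tau\,d\theta_i\,d\theta_j$.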


We may mention in passing that besides the infinitesimal calculus, other calculi (e.g. the finite calculus) may be used to derive relations of this nature.

Now we are faced with the problem of finding a statistic $t(x_1, x_2, \ldots, x_n)$ such that the integral (2.3) is a minimum subject to the restriction (2.2) and all restrictions of the nature (2.4) derivable from it. Introducing Lagrangian undetermined multipliers $\lambda$'s, we can reduce the problem to one of unrestricted maximum or minimum. Thus, multiplying each of the relations (2.4) and (2.2) by a multiplier $2\lambda$ and subtracting these from (2.3), we get

$$\int \Big[t^2 F - 2\lambda_0\,tF - 2\sum_i \lambda_i\,t\,\Phi_i\Big]\,dv,$$

which has to be minimized. The Euler equation is obtained by partially differentiating the integrand with respect to $t$ and equating that to zero. Thus we get

$$tF - \lambda_0 F - \sum_i \lambda_i \Phi_i = 0, \tag{2.5a}$$

$$t = \lambda_0 + \sum_i \lambda_i\,\frac{\Phi_i}{F}. \tag{2.5}$$

Thus $t$ is determined by (2.5). We get the interesting result that if in any distribution an expression of the nature of the one given on the right side of (2.5) is free from the population parameters, then this gives an unbiased statistic with minimum variance for the estimation of its own expectation (provided the variance itself exists). To determine the $\lambda$'s we can make use of the relations (2.4) and (2.2). Thus multiplying (2.5) by $\Phi_i$ and integrating over the $x$'s, we get

$$a_i = \lambda_0 \int \Phi_i\,dv + \sum_j \lambda_j \int \frac{\Phi_i \Phi_j}{F}\,dv. \tag{2.6}$$

The integrals on the right-hand side involve the $\Phi$'s which are all known,





and so they themselves are uniquely known. We here do not enter into the question of convergence of these integrals, which are all assumed to be convergent. Incidentally we remark that this restricts the choice of the $h$'s in the illustrated cases of $\Phi$'s given above.

The set of linear equations (2.6) involving the $\lambda$'s gives us their values on being solved (the solvability of these linear equations is taken for granted). The variance of such a statistic, when it exists, is easily found. Thus from (2.5a) we get, after multiplying both sides by $t$ and integrating,

$$\int t^2 F\,dv = \lambda_0 \int tF\,dv + \sum_i \lambda_i \int t\,\Phi_i\,dv = \lambda_0\tau + \sum_i \lambda_i a_i,$$

and so the variance of $t$ is

$$V(t) = (\lambda_0 - \tau)\tau + \sum_i \lambda_i a_i. \tag{2.7}$$
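As a hedged illustration of how (2.5), (2.6) and (2.7) operate in practice, the following Python sketch applies them to a binomial variable with a single restriction of type (i), namely $\Phi_1 = dF/dp$, so that $\tau = p$ and $a_1 = d\tau/dp = 1$. The integrals are replaced by sums over $x$ (the discrete case), and the function names are illustrative assumptions, not part of the paper.

```python
from fractions import Fraction
from math import comb


def pmf(n, x, p):
    """Binomial frequency F(x; p) = C(n, x) p^x (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)


def dpmf(n, x, p):
    """Phi_1 = dF/dp, written out term by term so the arithmetic stays exact."""
    up = x * p**(x - 1) * (1 - p)**(n - x) if x > 0 else 0
    down = (n - x) * p**x * (1 - p)**(n - x - 1) if x < n else 0
    return comb(n, x) * (up - down)


def multipliers(n, p):
    """Solve (2.2) and (2.6) for lambda_0, lambda_1 with tau = p and a_1 = 1."""
    xs = range(n + 1)
    F = [pmf(n, x, p) for x in xs]
    Phi = [dpmf(n, x, p) for x in xs]
    tau, a1 = p, Fraction(1)
    s_f = sum(F)                                      # sum of F          (= 1)
    s_phi = sum(Phi)                                  # sum of Phi_1      (= 0)
    s_pp = sum(ph * ph / f for ph, f in zip(Phi, F))  # sum of Phi_1^2 / F
    # Two linear equations in (lambda_0, lambda_1), solved by Cramer's rule:
    #   tau = lambda_0 * s_f   + lambda_1 * s_phi     (from (2.2))
    #   a_1 = lambda_0 * s_phi + lambda_1 * s_pp      (from (2.6))
    det = s_f * s_pp - s_phi * s_phi
    lam0 = (tau * s_pp - a1 * s_phi) / det
    lam1 = (s_f * a1 - s_phi * tau) / det
    t = [lam0 + lam1 * ph / f for ph, f in zip(Phi, F)]   # statistic (2.5)
    variance = (lam0 - tau) * tau + lam1 * a1             # formula (2.7)
    return lam0, lam1, t, variance


lam0, lam1, t, variance = multipliers(5, Fraction(1, 3))
print(lam0, lam1, t, variance)
```

For $n = 5$ and $p = 1/3$ the sketch yields $t(x) = x/5$, which is free from $p$, and a variance equal to $p(1-p)/n$, in agreement with example A (i) of § 4.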

3. The method followed above involved the use of the Calculus of Variations; so it is implied that the stochastic variables are continuous, or satisfy similar conditions under which the Calculus of Variations is applicable. Also, the theory of the Calculus of Variations is very abstract in nature and so it is very difficult to comprehend the meaning and scope of the result (2.5). The regression method used by the author in the previous paper is simpler and more straightforward. Besides, this latter method is applicable for discrete or continuous stochastic variables. We can use this method to deduce the result (2.5) and thereby prove that the relation (2.5) gives a sufficient condition for the existence of an unbiased statistic with minimum variance.

We have already remarked that, in a given distribution, not every function of the parameters need possess an unbiased statistic. We have just now observed that a statistic which is of the form (2.5) may not exist, although there may be unbiased statistics. In the second case, though we do not get a decisive answer to our search for the unbiased statistic with minimum variance, a very interesting result can be obtained by the regression method about the lower bound of the variances of unbiased statistics. As this method has been discussed in the previous papers, we give the result without the proof:

$$V(T) \ge (\lambda_0 - \tau)\tau + \sum_i \lambda_i a_i,$$

where $T$ is a statistic whose expectation is $\tau$. The sign of equality holds only when $T$ is of the form (2.5), when it yields the unbiased statistic with minimum variance.
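The lower bound $V(T) \ge (\lambda_0 - \tau)\tau + \sum \lambda_i a_i$ can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's computation: it takes a binomial variable, the restrictions $\Phi_i = d^iF/dp^i$ (for which $\sum_x \Phi_i = 0$, so that $\lambda_0 = \tau$ and the bound reduces to $\sum \lambda_i a_i$), and the statistic $T = x(x-1)/n(n-1)$, which is unbiased for $\tau = p^2$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def pmf_coeffs(n, x):
    """Coefficients c[k] of p^k in F(x; p) = C(n, x) p^x (1 - p)^(n - x)."""
    c = [0] * (n + 1)
    for k in range(n - x + 1):
        c[x + k] = comb(n, x) * comb(n - x, k) * (-1) ** k
    return c


def derivative(c):
    """Differentiate a polynomial given by its coefficient list."""
    return [k * ck for k, ck in enumerate(c)][1:] or [0]


def evaluate(c, p):
    return sum(ck * p**k for k, ck in enumerate(c))


def solve(A, b):
    """Gauss-Jordan elimination in exact arithmetic (A square, non-singular)."""
    m = len(b)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        for r in range(m):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][m] for r in range(m)]


def lower_bound(n, p, a):
    """sum_i lambda_i a_i with Phi_i = d^i F / dp^i for i = 1, ..., len(a)."""
    order = len(a)
    F, Phi = [], []
    for x in range(n + 1):
        c = pmf_coeffs(n, x)
        F.append(evaluate(c, p))
        row = []
        for _ in range(order):
            c = derivative(c)
            row.append(evaluate(c, p))
        Phi.append(row)
    # Matrix of the sums of Phi_i Phi_j / F appearing in (2.6).
    J = [[sum(Phi[x][i] * Phi[x][j] / F[x] for x in range(n + 1))
          for j in range(order)] for i in range(order)]
    lam = solve(J, a)
    return sum(l * ai for l, ai in zip(lam, a))


n, p = 4, Fraction(1, 3)
tau = p**2
T = [Fraction(x * (x - 1), n * (n - 1)) for x in range(n + 1)]
F = [evaluate(pmf_coeffs(n, x), p) for x in range(n + 1)]
var_T = sum(f * (t - tau) ** 2 for f, t in zip(F, T))
b1 = lower_bound(n, p, [2 * p])        # a_1 = d tau / dp only
b2 = lower_bound(n, p, [2 * p, 2])     # a_1 and a_2 = d^2 tau / dp^2
print(b1, b2, var_T)
```

With one restriction the bound lies strictly below the variance of $T$; with two restrictions it is attained, because $T$, being a quadratic in $x$, is then of the form (2.5).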

4. Some simple examples are considered below to illustrate the above theory.





A. The binomial distribution is first considered as an example of a uni-parametric discrete distribution. Here

$$F(x;\ p) = \binom{n}{x}\,p^x (1-p)^{n-x}.$$

So we have the following results (which can be easily verified):

(i) $$\frac{x}{n} = p + \frac{p(1-p)}{n}\,\frac{1}{F}\frac{dF}{dp},$$

which shows that $x/n$ is an unbiased statistic for the estimation of $p$ with minimum variance.

(ii) $$\frac{x(n-x)}{n^2(n-1)} = \frac{p(1-p)}{n} + \frac{p(1-p)(1-2p)}{n^2}\,\frac{1}{F}\frac{dF}{dp} - \frac{p^2(1-p)^2}{n^2(n-1)}\,\frac{1}{F}\frac{d^2F}{dp^2},$$

which shows that $x(n-x)/n^2(n-1)$ is an unbiased statistic with minimum variance for the estimation of $p(1-p)/n$ (which is the variance of the estimate of $p$).
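The two binomial identities can be verified in exact arithmetic. The sketch below assumes the closed forms $(1/F)\,dF/dp = (x - np)/p(1-p)$ and $(1/F)\,d^2F/dp^2 = [(x-np)/p(1-p)]^2 - x/p^2 - (n-x)/(1-p)^2$, which follow from differentiating $\log F$; the function name is illustrative.

```python
from fractions import Fraction


def check_binomial_identities(n, p):
    """Check, for every x = 0..n, that
         x/n               = p + [pq/n] s1
         x(n-x)/(n^2(n-1)) = pq/n + [pq(1-2p)/n^2] s1 - [p^2 q^2/(n^2(n-1))] s2
       where q = 1 - p, s1 = F'/F and s2 = F''/F."""
    q = 1 - p
    for x in range(n + 1):
        s1 = (x - n * p) / (p * q)                  # (1/F) dF/dp
        s2 = s1**2 - x / p**2 - (n - x) / q**2      # (1/F) d^2F/dp^2
        lhs_i = Fraction(x, n)
        rhs_i = p + p * q / n * s1
        lhs_ii = Fraction(x * (n - x), n**2 * (n - 1))
        rhs_ii = (p * q / n
                  + p * q * (1 - 2 * p) / n**2 * s1
                  - p**2 * q**2 / (n**2 * (n - 1)) * s2)
        if lhs_i != rhs_i or lhs_ii != rhs_ii:
            return False
    return True


print(check_binomial_identities(6, Fraction(2, 7)))
```

Because the right-hand sides reduce to expressions free from $p$ for every $x$, each is of the form (2.5) and the corresponding statistic has minimum variance.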


(hi) 


* + r -W 0 ^ 


which shows that $1/(x + r)$ is an unbiased statistic with minimum variance. The mathematical expectation of this is

which reduces to $[1 - (1-p)^{n+1}]/[(n+1)p]$ when $r = 1$. This shows that $(n+1)/(x+1)$ is a statistic with minimum variance for the estimation of $1/p$ with a negative bias equal to $(1-p)^{n+1}/p$. This bias becomes negligible if $n$ is sufficiently large or if $p$ is very nearly equal to unity.
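The stated bias of $(n+1)/(x+1)$ as an estimate of $1/p$ can be confirmed by direct summation over the binomial distribution. A minimal sketch in exact arithmetic follows; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def expectation_of_scaled_reciprocal(n, p):
    """E[(n + 1)/(x + 1)] for x binomial (n, p), summed term by term."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) * Fraction(n + 1, x + 1)
               for x in range(n + 1))


def bias(n, p):
    """Bias of (n + 1)/(x + 1) as an estimate of 1/p."""
    return expectation_of_scaled_reciprocal(n, p) - 1 / p


for n in (3, 10, 30):
    p = Fraction(1, 2)
    # The two printed values agree: the bias equals -(1 - p)^(n + 1) / p.
    print(n, bias(n, p), -(1 - p) ** (n + 1) / p)
```

The bias shrinks geometrically in $n$, which is the sense in which it becomes negligible for large samples.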

B. The next distribution to be considered is the two-parametric normal distribution. The distribution function of a sample from a normal distribution with population mean $m$ and population standard deviation $\sigma$ is

$$F(x_1, x_2, \ldots, x_n;\ m, \sigma) = \frac{1}{(\sqrt{2\pi}\,\sigma)^n}\exp\left\{-\frac{\sum (x_i - m)^2}{2\sigma^2}\right\}.$$







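The following results can be easily verified:

(i) $$\frac{\sum x_i}{n} = m + \frac{\sigma^2}{n}\,\frac{1}{F}\frac{\partial F}{\partial m},$$

which shows that $\sum x_i/n = \bar{x}$ is an unbiased statistic with minimum variance for the estimation of the parameter $m$.

(ii) $$\frac{\sum (x_i - \bar{x})^2}{n-1} = \sigma^2 + \frac{\sigma^3}{n-1}\,\frac{1}{F}\frac{\partial F}{\partial \sigma} - \frac{\sigma^4}{n(n-1)}\,\frac{1}{F}\frac{\partial^2 F}{\partial m^2},$$

which shows that $\sum (x_i - \bar{x})^2/(n-1)$ is an unbiased statistic with minimum variance for the estimation of $\sigma^2$.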
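Both normal-sample identities are algebraic and hold for any sample and any values of $m$ and $\sigma$. The sketch below checks them in exact arithmetic, assuming the closed forms $(1/F)\,\partial F/\partial m = \sum(x_i - m)/\sigma^2$, $(1/F)\,\partial^2 F/\partial m^2 = [\sum(x_i - m)/\sigma^2]^2 - n/\sigma^2$ and $(1/F)\,\partial F/\partial\sigma = -n/\sigma + \sum(x_i - m)^2/\sigma^3$; names are illustrative.

```python
from fractions import Fraction


def normal_identities(xs, m, sigma):
    """Return ((mean, rhs of (i)), (sample variance, rhs of (ii)))."""
    n = len(xs)
    xbar = sum(xs) / n
    s_dev = sum(x - m for x in xs)                  # sum of (x_i - m)
    s_sq = sum((x - m) ** 2 for x in xs)            # sum of (x_i - m)^2
    dF_dm = s_dev / sigma**2                        # (1/F) dF/dm
    d2F_dm2 = dF_dm**2 - n / sigma**2               # (1/F) d^2F/dm^2
    dF_dsigma = -n / sigma + s_sq / sigma**3        # (1/F) dF/dsigma
    rhs_mean = m + sigma**2 / n * dF_dm
    rhs_var = (sigma**2
               + sigma**3 / (n - 1) * dF_dsigma
               - sigma**4 / (n * (n - 1)) * d2F_dm2)
    sample_var = sum((x - xbar) ** 2 for x in xs) / (n - 1)
    return (xbar, rhs_mean), (sample_var, rhs_var)


sample = [Fraction(1), Fraction(3), Fraction(-2), Fraction(5, 2)]
(mean, rhs_i), (s2, rhs_ii) = normal_identities(sample, Fraction(1, 2), Fraction(3, 2))
print(mean == rhs_i, s2 == rhs_ii)
```

The parameter values passed in are arbitrary: the right-hand sides do not depend on them, which is exactly the condition under which (2.5) yields a statistic.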

C. Two samples, $x_1, x_2, \ldots, x_n$ and $x_1', x_2', \ldots, x'_{n'}$, of sizes $n$ and $n'$, are taken from two normal populations with zero mean and standard deviations $\sigma$ and $\sigma'$ respectively. Then the distribution function of these two samples is

$$F(x_1, x_2, \ldots, x_n;\ x_1', x_2', \ldots, x'_{n'};\ \sigma, \sigma') = \frac{1}{(\sqrt{2\pi}\,\sigma)^n (\sqrt{2\pi}\,\sigma')^{n'}}\exp\left(-\frac{\sum x_i^2}{2\sigma^2} - \frac{\sum x_i'^2}{2\sigma'^2}\right).$$

It can easily be shown that


and hence $\sum x_i^2/\sum x_i'^2$ is an unbiased statistic with minimum variance.
It is also seen that

$$E\left(\frac{\sum x_i^2}{\sum x_i'^2}\right) = \frac{n}{n'-2}\,\frac{\sigma^2}{\sigma'^2}$$

and

$$E\left(\frac{\sum x_i^2}{\sum x_i'^2}\right)^2 = \frac{n(n+2)}{(n'-2)(n'-4)}\,\frac{\sigma^4}{\sigma'^4}.$$


From these we can get the variance. 
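The variance left to the reader can be written out as follows. This is a worked sketch, assuming the two expectations take the values $n\sigma^2/(n'-2)\sigma'^2$ and $n(n+2)\sigma^4/(n'-2)(n'-4)\sigma'^4$ (the standard moments of a ratio of independent scaled chi-square variables), which requires $n' > 4$:

```latex
\begin{aligned}
V\!\left(\frac{\sum x_i^2}{\sum x_i'^2}\right)
  &= \frac{n(n+2)}{(n'-2)(n'-4)}\,\frac{\sigma^4}{\sigma'^4}
   - \frac{n^2}{(n'-2)^2}\,\frac{\sigma^4}{\sigma'^4} \\[4pt]
  &= \frac{n\left[(n+2)(n'-2) - n(n'-4)\right]}{(n'-2)^2(n'-4)}\,\frac{\sigma^4}{\sigma'^4} \\[4pt]
  &= \frac{2n(n+n'-2)}{(n'-2)^2(n'-4)}\,\frac{\sigma^4}{\sigma'^4}.
\end{aligned}
```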

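D. Let $x$ and $y$ be two correlated normal variables with zero mean, and let them have a distribution function of the form

$$(2\pi)^{-1}(\alpha\beta - \gamma^2)^{1/2}\exp\{-\tfrac{1}{2}(\alpha x^2 + \beta y^2 - 2\gamma xy)\}.$$

The distribution function of a sample of size $n$ is of the form

$$F(x_1, y_1;\ x_2, y_2;\ \ldots;\ x_n, y_n;\ \alpha, \beta, \gamma) = (2\pi)^{-n}(\alpha\beta - \gamma^2)^{n/2}\exp\{-\tfrac{1}{2}(\alpha\textstyle\sum x_i^2 + \beta\sum y_i^2 - 2\gamma\sum x_i y_i)\}.$$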





It is readily seen that 




r* - (- - a ^ 4> yl) * l / ai3 “ ^' %iFdaJ ^ 

which shows that $r^2$ is a statistic with minimum variance. Now the mathematical expectation of $r^2$ is given by

£(' J W(aj 3 jVp-y-f ^/9 

--(aj 9 1 dad ft 
4 J a 

+ ?^!Llll ) y. (a/3 _^ir r (ap-^' a dadp. 

4 J a 


It is possible to express these integrals in finite form, and so $E(r^2)$ can be expressed in a finite form. However, when $n$ is large, this finite expression will contain a very large number of terms and so is not suitable. We can, however, find an approximate value in such cases. The behaviour of the integral of the form

$$I_N = \int_\alpha^\infty\!\int_\beta^\infty (ab - \gamma^2)^{-N}\,da\,db,$$

when $N$ is large, can be studied by means of the transformation $a = \alpha + u(\alpha\beta - \gamma^2)/N$ and $b = \beta + v(\alpha\beta - \gamma^2)/N$, where $u$ and $v$ are the new variables. We get

$$I_N = \int_0^\infty\!\int_0^\infty \left[(\alpha\beta - \gamma^2) + u\beta(\alpha\beta - \gamma^2)/N + v\alpha(\alpha\beta - \gamma^2)/N + uv(\alpha\beta - \gamma^2)^2/N^2\right]^{-N}\frac{(\alpha\beta - \gamma^2)^2}{N^2}\,du\,dv.$$

Hence

$$N^2(\alpha\beta - \gamma^2)^{N-2}\,I_N = \int_0^\infty\!\int_0^\infty \left[1 + u\beta/N + v\alpha/N + uv(\alpha\beta - \gamma^2)/N^2\right]^{-N}du\,dv.$$

Now as the integral on the right side is uniformly convergent for all values of $\alpha$, $\beta$, and $N$, and as the integrand uniformly approaches the value $\exp(-u\beta - v\alpha)$ for all values of $\alpha$, $\beta$, $u$, $v$ as $N$ tends to infinity, it is not difficult to prove that

$$\lim_{N\to\infty}\left[N^2(\alpha\beta - \gamma^2)^{N-2}\,I_N\right] = \int_0^\infty\!\int_0^\infty e^{-u\beta - v\alpha}\,du\,dv = 1/\alpha\beta.$$
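The two steps of this limiting argument can be checked in a few lines. The sketch below assumes the substitution $a = \alpha + uD/N$, $b = \beta + vD/N$ with $D = \alpha\beta - \gamma^2$ and an integrand of the form $(ab - \gamma^2)^{-N}$; it verifies the algebraic factorisation exactly and the pointwise limit of the integrand approximately. The function names are illustrative.

```python
from fractions import Fraction
import math


def transformed_bracket(alpha, beta, gamma, u, v, N):
    """(ab - gamma^2) / D after substituting a = alpha + u D / N and
    b = beta + v D / N, where D = alpha * beta - gamma^2."""
    D = alpha * beta - gamma**2
    a = alpha + u * D / N
    b = beta + v * D / N
    return (a * b - gamma**2) / D


def expected_bracket(alpha, beta, gamma, u, v, N):
    """The bracket 1 + u beta / N + v alpha / N + u v D / N^2."""
    D = alpha * beta - gamma**2
    return 1 + u * beta / N + v * alpha / N + u * v * D / N**2


# Exact check of the factorisation with rational inputs.
args = (Fraction(2), Fraction(3), Fraction(1), Fraction(5, 7), Fraction(1, 4), Fraction(9))
print(transformed_bracket(*args) == expected_bracket(*args))

# Pointwise limit of the integrand: bracket^(-N) approaches exp(-u beta - v alpha).
N = 1e6
approx = expected_bracket(2.0, 3.0, 1.0, 1.0, 1.0, N) ** (-N)
print(approx, math.exp(-(1.0 * 3.0 + 1.0 * 2.0)))
```

The double integral of the limiting integrand factorises into $\int_0^\infty e^{-u\beta}du \cdot \int_0^\infty e^{-v\alpha}dv = 1/\alpha\beta$, which is the value quoted.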





Now 


£(**) - *(aP - y*) J V. + — +3 V («fl - y*)*A 

4 i" 1,4 4 i + 


4 l a + V 


(n \' 

V 1 




So, as $n$ becomes indefinitely large, we get $E(r^2) \to \gamma^2/\alpha\beta$ to the first order of approximation.


REFERENCES TO LITERATURE 

Aitken, A. C., and Silverstone, H., 1941-42. “On the Estimation of Statistical Parameters”, Proc. Roy. Soc. Edin., A, LXI, 186-194.

Bhattacharyya, A., 1946. “On some Analogues of the Amount of Information and their Uses in Statistical Estimation”, Sankhyā, VIII, 1-14.

Bhattacharyya, A., 1947. Idem, ibid., VIII, 201-218.

Bhattacharyya, A., 1948. Idem, ibid., VIII, 315-328.

Fisher, R. A., 1921. “Mathematical Foundations of Theoretical Statistics”, Phil. Trans. Roy. Soc., A, CCXXII, 309-368.


(Issued separately June 16, 1950)





VII.—Parallel Planes in a Riemannian $V_n$. By H. S. Ruse,

The University, Leeds. 

(MS. received June 9, 1949. Read November 7, 1949) 

Synopsis 

In an earlier paper a description was given, in terms of classical projective geometry, of some of the properties of parallel fields of vector spaces (parallel planes) in a Riemannian $V_n$, and a detailed analysis was made of the case $n = 4$. The present paper contains the corresponding formulae for any $n$, though omits their projective interpretation. A parallel $p$-plane is said to be of nullity $q$ when the $p$ vectors of any normal basis contain $q$ null and $p - q$ non-null vectors. The conditions of parallelism, namely that the covariant derivatives of the basis-vectors should depend linearly upon these vectors, are examined for any $p$ and any $q$ ($\le p$), and attention is thereafter mainly confined to the cases
(i) n even, y = - i,/ = 4« - 1 or (ii) n odd, q=*\(n -3)) ,/ = - 1)1 which possess 

exceptional features. In the former of these cases light is thrown upon the curious circumstance, noted in the previous paper, that the existence in a $V_4$ of a null parallel 1-plane necessitates the existence of parallel planes other than its conjugate. For a general $n$ similar situations arise in the cases indicated.

This paper follows two others. One, by A. G. Walker,* initiated the theory of parallel fields of partially null vector spaces in a Riemannian $V_n$, and the other, by myself,† developed some of the geometrical and analytical consequences of his theory, with particular reference to the case $n = 4$. The present paper gives the generalization to any $n$ of results previously obtained for $n = 4$.

The terminology is that of the previous papers. The $V_n$ is assumed to be analytic and of fundamental tensor $g_{ij}$, the letters $i, j, k, \ldots$ being used for tensor suffixes (range $1, 2, \ldots, n$). The inner product $X_i Y^i \equiv g_{ij}X^i Y^j$ of any two vectors is denoted by $(XY)$ or $(YX)$.

This paper is concerned solely with fields of vectors and of vector spaces, but as a rule the word field will be omitted. So, for example, a parallel field of vector spaces, or, as Walker calls them, a parallel field of planes, will be referred to as a parallel plane, or as a parallel $p$-plane when its dimensionality $p$ needs explicit mention.

1. Parallel $p$-Plane in $V_n$

A necessary and sufficient condition (Walker, Theorem 3.3) that a set 
of / independent contravariant vectors A's (A,, . . ., A£) should form a 
basis for a (contravariant) parallel /-plane V* in V n is that their covariant 

* A. G. Walker, “On Parallel Fields of Partially Null Vector Spaces”, Quart. Journ. Math. (Oxford), XX, 1949, 135-145.

† H. S. Ruse, “On Parallel Fields of Planes in a Riemannian Space”, Quart. Journ. Math. (Oxford), XX, 1949, 218-234.





derivatives Kj should be linear combinations of the vectors themselves, 
say 

Alj a (r.i) 

where the functions A ^ are covariant vectors of V nt a, 8 being scalar 
indices. The identities (i.l) are called recurrence-formula for the vectors 

K- 

If h where the scalars /f are such that det | /£ | *0, then the 
satisfy relations of the form (i.i) and constitute a new basis for the same 
parallel $p$-plane. The vectors of a basis may therefore be chosen, in an infinite number of ways, to be mutually orthogonal, and, when non-null, to be unit. Such a basis is called normal. For a given $p$-plane, the number $q$ of null vectors is the same for every normal basis, and is called the nullity of the $p$-plane. It is easy to see that

$$q \le \min(p,\ n - p), \tag{1.2}$$

and hence that $q \le \tfrac{1}{2}n$, equality being possible only when $n$ is even.

The null part of a parallel $p$-plane $V^p$ is the vector space (or rather field of vector spaces), having as basis the $q$ null vectors of any normal basis of $V^p$. It, too, is parallel (Walker, Theorem 3.1). When $q = 0$, the null part of $V^p$ is the 0-plane, and contains only the zero vector.

The totality of vector fields which, at every point of $V_n$, are orthogonal to the vectors of $V^p$ at that point, constitute the parallel $(n-p)$-plane conjugate to $V^p$ (Walker, ibid.). It has the same null part as $V^p$.


2. The Parallel $n$-Plane: Quasi-Normal Basis


The totality of contravariant vectors at any point of $V_n$ constitute the vector $n$-space at that point, and the field of these spaces over $V_n$ is the parallel $n$-plane $V^n$ of $V_n$. By (1.2), the nullity of $V^n$ is zero.

As a normal basis for the $n$-plane we may take any orthogonal ennuple.


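Let such an ennuple be $h^i_\alpha \equiv (h^i_1, \ldots, h^i_n)$, where

$$(h_\alpha h_\beta) \equiv h^i_\alpha h_{\beta i} = e_\alpha \delta_{\alpha\beta} \quad (e_\alpha = \pm 1) \quad \text{(not summed for } \alpha\text{)}, \tag{2.1}$$

and let

$$h^\alpha_i = e_\alpha h_{\alpha i} \quad \text{(not summed)}. \tag{2.2}$$

Then by (2.1),

$$(h^\alpha h_\beta) = \delta^\alpha_\beta. \tag{2.3}$$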


The corresponding recurrence-formulae are 

Li _ jb Li 





In this case the coefficients are given by 

or by 
where 

y«.« = 

Yate being Ricci’s coefficients of rotation. 

If instead of taking the orthogonal ennuple $h^i_\alpha$ we take a quasi-orthogonal ennuple* consisting partly of null and partly of non-null vectors, we obtain what may be called a quasi-normal basis for $V^n$. This may be done as follows.

Let $r$ be any positive integer $\le \tfrac{1}{2}n$, and choose from among the $h^i_\alpha$ any two non-overlapping sets of $r$ of them. Without loss of generality these may be taken to be the first $r$ and the last $r$, so that the vectors are divided into three consecutive sets. Writing† $R$, $S$, $R'$ for the three ranges of numbers indicated in the diagram

    (1, 2, ..., r)    (r + 1, r + 2, ..., n - r)    (n - r + 1, ..., n)
          R                      S                          R'

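we then have

$$g_{ij} = \sum_{\alpha=1}^{n} e_\alpha h_{\alpha i}h_{\alpha j} = \sum_{\alpha \in R} e_\alpha h_{\alpha i}h_{\alpha j} + \sum_{\sigma \in S} e_\sigma h_{\sigma i}h_{\sigma j} + \sum_{\alpha' \in R'} e_{\alpha'} h_{\alpha' i}h_{\alpha' j}. \tag{2.5}$$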

Thus $R$, $S$, $R'$ are respectively the first $r$, the middle $n - 2r$, and the last $r$ numbers of the set $1, 2, \ldots, n$.

For any given $\alpha$, let $\alpha' = \alpha + n - r$. Then when $\alpha \in R$, clearly $\alpha' \in R'$, in conformity with (2.5). Let

Va <«c*) (s.6) 

(not summed for a, a'), where <J = i or i[—V( ~ 0 ] according as *, = +1 
or - 1. Then the 2r vectors hi, A B . are replaced by the 2 r vectors 
[*. The latter, namely C“, C*i ■ • •» C rt » could alternatively have been 
denoted by £, (or, for that matter, by £,), with o' running over R 1 , but 

* Y. C. Wong, “Quasi-orthogonal Ennuple of Congruences in a Riemannian Space”, Ann. of Math., XLVI, 1945, 158-173.

† The idea of representing suffix-ranges by single capital letters is due to A. G. Walker.





the notation adopted, which is similar to that used by Wong, proves to 
be more convenient. 

It is obvious that the vectors $\xi_\alpha$, $\zeta^\alpha$ may or may not be real. They are real only when each $e_\alpha = +1$ and each $e_{\alpha'} = -1$. For a Riemannian space of positive-definite metric they will certainly all be unreal.

From (2.6), (2.5) and (2.2) we get


gu “ £<*£? ++ tfkoj 

(aeR, oeS), 

( 2 . 7 ) 

where the summation convention now 

applies to Greek indices, 

each 

taken over its appropriate range. Also from (2.6), (2.1), (2.2) and 
it follows that * 

( 2 - 3 ) 


[(a,j8)c*], 

(a.8) 



(2.9) 

(f.A 0 )-o-(C“A 0 ) 


(2.10) 

(A. ao- s; 

[(a, t) c 5 ]. 

(2.11) 


Thus, by (2.8), the $\xi$'s and $\zeta$'s are all null vectors, those of each set being mutually orthogonal. Also, by (2.9), each $\xi$ is orthogonal to all but one of the $\zeta$'s, namely the one having the same index $\alpha$. By (2.10), the $\xi$'s and $\zeta$'s are all orthogonal to all the unit vectors $h_\sigma$. Equation (2.11) is simply (2.3) applied to the “middle” set of the vectors $h_\sigma$, and expresses the fact that they are all unit and mutually orthogonal.

The $2r$ null vectors $\xi_\alpha$, $\zeta^\alpha$ and the $(n - 2r)$ unit vectors $h_\sigma$ form a quasi-orthogonal ennuple of nullity† $r$, the vectors all being mutually orthogonal except each $\xi_\alpha$ and the corresponding $\zeta^\alpha$. If $n$ is even and $r$ is taken equal to $\tfrac{1}{2}n$, all the vectors are null.
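The orthogonality pattern of a quasi-orthogonal ennuple can be illustrated in a flat space with a diagonal metric of signs $e_\alpha$. The sketch below is an assumption-laden illustration, not the paper's definition (2.6), which is not legible in this copy: it pairs each of the first $r$ orthonormal vectors with the corresponding one of the last $r$ (index $\alpha' = \alpha + n - r$), requires $e_\alpha = +1$ and $e_{\alpha'} = -1$ so that the null vectors are real, and adopts the hypothetical normalisation $\xi = \tfrac{1}{2}(h_\alpha + h_{\alpha'})$, $\zeta = h_\alpha - h_{\alpha'}$.

```python
from fractions import Fraction


def inner(X, Y, signs):
    """Inner product (XY) = sum_i e_i X^i Y^i for a diagonal metric of signs e_i."""
    return sum(e * x * y for e, x, y in zip(signs, X, Y))


def null_pairs(signs, r):
    """Build r null pairs (xi, zeta) and the middle unit vectors from the
    coordinate ennuple of a flat space with the given signature."""
    n = len(signs)
    h = [[Fraction(int(i == a)) for i in range(n)] for a in range(n)]
    xi, zeta = [], []
    for a in range(r):
        a_prime = a + n - r                      # alpha' = alpha + n - r
        if signs[a] != 1 or signs[a_prime] != -1:
            raise ValueError("this real sketch needs e_alpha = +1, e_alpha' = -1")
        xi.append([(x + y) / 2 for x, y in zip(h[a], h[a_prime])])
        zeta.append([x - y for x, y in zip(h[a], h[a_prime])])
    middle = h[r:n - r]                          # the n - 2r unit vectors
    return xi, zeta, middle


signs = [1, 1, 1, -1, -1]                        # n = 5, r = 2
xi, zeta, middle = null_pairs(signs, 2)
for a in range(2):
    for b in range(2):
        print(a, b, inner(xi[a], xi[b], signs), inner(zeta[a], zeta[b], signs),
              inner(xi[a], zeta[b], signs))
```

Every product vanishes except that of each $\xi$ with the $\zeta$ of the same index, which is unity under the assumed normalisation; the middle vector remains a unit vector orthogonal to all the null ones.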

With these vectors as a quasi-normal basis for V", the recurrence- 
formulae take the form 

+<%**, 

where (o, fJ) e R and (o, r) e S, the Greek indices in the coefficients 
A#, . . ., being written in the covariant or contravariant positions 
in such a way as to conform with the Greek indices of the £’s, £’s and A’s. 

From (2.8), . . ., (2.11) we quickly find that and C°j are both 

* The notation $(\alpha, \beta) \in R$ is an abbreviation for “$\alpha \in R$ and $\beta \in R$”.

† Wong, loc. cit. This use of nullity is different from that of § 1, where it was used in connection with parallel planes. The latter meaning is the one to be understood in this paper, except in the present instance, which is the only occasion upon which the word is used in Wong’s sense.






skew in a, / 3 ; that — -A^\ that — -G$\ that Q cU = - Rpaj\ an< * 
that is skew in a, t. Thus the recurrence-formulae for the parallel 
n-plane, which are in fact identities satisfied by the vectors of the given 
quasi-orthogonal ennuple, reduce to * 


« a,-eye,-w*+(%*«, 

to -(%&-. 

where 

C a j‘ m -C*, — -ff Ta j, 

and 

(a,p)eR, (<r,r)eS. 


(a.n) 




Equations (2.12) will be referred to as the general identities for the quasi- 
orthogonal ennuple (£, t M , h* a ). 

When «=4 and r = 2, these give the formulae (4.12) of my previous 
paper. There the £ are denoted by £*, 17*, and the (“* by £'*, a U four 
vectors of the quasi-orthogonal ennuple being null. The coefficients here 
denoted by At are there represented by A t , B h C t , D h while B ltJ and 
Cf* are represented by - Wj and - 


3. Parallel $(r+s)$-Plane of Nullity $r$

Any normal basis for a parallel $(r+s)$-plane of nullity $r$ contains $r$ null and $s$ unit vectors. By (1.2), $r \le \min(r+s,\ n-r-s)$, and so, in particular, $r \le n-r-s$, whence $r+s \le n-r$. Let $R$, $S_1$, $S_2$ be the suffix-ranges defined by

    (1, 2, ..., r)    (r + 1, r + 2, ..., r + s)    (r + s + 1, ..., n - r)
          R                      S_1                          S_2

$S_2$ being non-existent when it happens that $r+s = n-r$. $S_1$ and $S_2$ together comprise the range hitherto denoted by $S$, so $S = (S_1, S_2)$.

Let the null vectors of a given normal basis for the $(r+s)$-plane be $\xi^i_\alpha \equiv (\xi^i_1, \ldots, \xi^i_r)$ (so $\alpha \in R$), and the non-null be $h^i_\mu \equiv (h^i_{r+1}, \ldots, h^i_{r+s})$ (so $\mu \in S_1$). Then the recurrence-formulae for the parallel $(r+s)$-plane have the form

ia,J m + Rawj^t ( 3 > 3 ) 

(3.3) 

where $(\alpha, \beta) \in R$ and $(\mu, \nu) \in S_1$, these merely stating that the covariant derivatives of the basis-vectors are linearly expressible in terms of those vectors.



* Cf. Wong, loc. cit., equation (3.a).




The orthogonal vectors $\xi^i_\alpha$, $h^i_\mu$ may be taken as $r + s$ of the vectors of a quasi-orthogonal ennuple, which may be completed by the choice of $r$ null vectors $\zeta^{\alpha i}$ and of $n - 2r - s$ unit vectors, say $k^i_\theta \equiv (k^i_{r+s+1}, \ldots, k^i_{n-r})$ (so that $\theta \in S_2$). The last set might be denoted by $h^i_\theta$ because they, with the $h^i_\mu$, together make up the whole set of $n - 2r$ unit vectors $h^i_\sigma$ [$\sigma \in (S_1, S_2)$] of the quasi-orthogonal ennuple, but they are denoted by $k^i_\theta$ to emphasize the fact that they are not part of the basis of the given $(r+s)$-plane. We thus have the sets of vectors $\xi^i_\alpha$, $\zeta^{\alpha i}$, $h^i_\mu$, $k^i_\theta$, respectively $r$, $r$, $s$, $n - 2r - s$ in number, of which $\xi^i_\alpha$, $h^i_\mu$ define the parallel $(r+s)$-plane, the $\zeta$'s and $k$'s being auxiliary. The general identities (2.12) for the whole ennuple are now

(a) + + F^+F^, ' 

(A) i+ + 

(<) ^--^r^+V+V. ( 

(d) - - Gffi - 4 

where $(\alpha, \beta) \in R$, $(\mu, \nu) \in S_1$, $(\theta, \phi) \in S_2$. These equations are the same as (2.12), but with the range $S$ of the indices $\sigma$, $\tau$ in (2.12) split into the two ranges $S_1$, $S_2$, the $h^i_\sigma$, for $\sigma \in S_2$, being written as $k$'s.

For the f’s and A’s to define a parallel (r-t-j)-plane, (3.4) (a) and (e) 
must be respectively of the form (3.2), (3.3). Hence, in (3.4) (a), both 
and are zero, while in (3.4) ( c) both and are zero. 
Thus all the F' s are zero, and the right-hand side of (3.4) (a) reduces to 
the f-term. Also, because is skew in a, r, by (2.13), in (3.4) (d) 
is zero because is so. Hence recurrence-formulae appropriate to the 
case of a parallel (r + $)-plane of nullity r are of the form 

(«) 

(A) C 

to A^- -<%$, +ff^, 

{d) K.j - - 

where (a, ff) c F, (p, v) e S u ( 6 , <f>) e 5 , and h? i =e t k* r 

Equations (3.5) (a), (c) are the ones defining the given $(r+s)$-plane $\Pi$. The fact that (3.5) (a) involves only the $\xi$ shows analytically that the null part $\varpi$ of the parallel $(r+s)$-plane is also parallel, as stated in § 1 above. The non-null part, defined by the $h_\mu$, is in general not parallel, because the right-hand side of (3.5) (c) involves the $\xi$'s as well as the $h$'s: indeed, “non-null part” is really a misnomer, because a partially null parallel plane in general has no uniquely defined non-null part, the $h_\mu$ being replaceable by linear combinations of themselves and of the $\xi$. Equations





(3.5) (a), (d) show that the $r$ null vectors $\xi$ and the $n - 2r - s$ unit vectors $k_\theta$, all of which are orthogonal to one another, are a normal basis of a parallel $(n-r-s)$-plane. This is the $(n-r-s)$-plane $P$ conjugate to the given $(r+s)$-plane $\Pi$, its null part, which has the $\xi$ for basis, being the same as the null part $\varpi$ of $\Pi$. Similarly, (3.5) (a), (c), (d) together show that the $(n-r)$-plane $p$ of basis $(\xi, h_\mu, k_\theta)$ is parallel. It is the plane conjugate to the null part $\varpi$ of the original plane. All of the planes intersect in $\varpi$ and are contained in $p$.

Equations (3.5) (b) are auxiliary, the four sets of equations (3.5) being recurrence-formulae for the whole $n$-plane $V^n$, which, of course, contains $\Pi$, $P$, $\varpi$ and $p$.

Equations (3.5) are the most general ones for a partially null parallel plane in $V_n$. Various special cases are obtainable from them. For example, if the given parallel plane is wholly null (case $s = 0$), then there are no vectors $h_\mu$, but there are $n - 2r$ vectors $k_\theta$. In (3.5) (b) the term in $h_\mu$ disappears, and (3.5) (c) is non-existent. Otherwise the equations remain formally the same.* Again, if it happens that $n$ is even and $r = \tfrac{1}{2}n$, all the vectors of the quasi-orthogonal ennuple are null, and the general identities reduce simply to

(a) 

V) 

with $\alpha$, $\beta$ running from 1 to $\tfrac{1}{2}n$. Equations (3.6) (a) are the recurrence-formulae for the null parallel $\tfrac{1}{2}n$-plane, which is self-conjugate, (3.6) (b) being auxiliary. If it also happens that $C^{\alpha\beta}{}_j \equiv 0$, then the $\zeta$ are a basis of a second null parallel $\tfrac{1}{2}n$-plane having only the 0-plane in common with the former.

The case when $n$ is even and $r = \tfrac{1}{2}n$ thus contains nothing of exceptional interest. The case when $n$ is even and $r = \tfrac{1}{2}n - 1$ has, however, certain peculiar features, and the same is true when $n$ is odd and $r = \tfrac{1}{2}(n-3)$. These are considered in the following sections.

4. Null Parallel $(\tfrac{1}{2}n - 1)$-Plane in $V_n$ ($n$ even)

Suppose $n$ even, and let

$$M \equiv 1, 2, 3, \ldots, (\tfrac{1}{2}n - 1); \qquad (M, \tfrac{1}{2}n) \equiv 1, 2, 3, \ldots, (\tfrac{1}{2}n - 1), \tfrac{1}{2}n.$$

Suppose also that $V_n$ admits a null parallel $(\tfrac{1}{2}n - 1)$-plane having as normal

* The vectors $k_\theta$ are auxiliary, and pairs of them could be replaced by pairs of auxiliary null vectors defined in terms of them by formulae similar to (3.6), in which case (3.5) would be modified in form.






basis a set of independent orthogonal null vectors . 

The recurrence-formulae for this plane are then of the form



(41) 

Form a quasi-orthogonal ennuple, consisting entirely of null vectors, 
by taking, with the $[, a further null vector £j tn) —thereby making a set 

of \n ^-vectors £ s [£^ f —and \n null vectors 

general identities for this ennuple are of the form 

By (2.12), the 

i " ip + 

( 4 -a) 


( 4 . 3 ) 

where (a, p) e ( M , ^n), B aU and C“j being skew in 0, j 8 . 


Equations (4.1) must be the same as the first \n - 1 of equations (4.2), 

namely 


& j “ + [ J e (J/ ( W ]» 

( 4 - 4 ) 

the (i»)th equation (4.2) being 



( 4 - 5 ) 

Comparison of (4.4) with (4.1) gives 


A%^~o (pcAf) 

(4.6) 

and 


Bpfij = 0 [peM, fie(Af, |«)]. 

( 4 - 7 ) 


The last equation shows that, for any fixed $j$, the first $\tfrac{1}{2}n - 1$ rows of the $(\tfrac{1}{2}n) \times (\tfrac{1}{2}n)$ matrix $B_{\alpha\beta j}$ are zero. Therefore, since the matrix is skew, it is wholly zero, so

$$B_{\alpha\beta j} = 0. \tag{4.8}$$

Thus (4.4) and (4.5) become 

-e&l [(p,v)'M], (4-9) 

( 4 - 10 ) 

(4.9) being the same as (4.1). These two equations together show that 
the vectors = [£‘, are a (normal) basis of a (null) parallel $»-plane. 
Now consider the equation obtained by taking a = }» in (4.3), namely 
{On* _ [/3 € (M, *«)]. (4.11) 

By (4.6), the last term on the right of (4.11) is equal to and 

so involves only the vector out of the set Also, since is 
skew in a, j8, we have C ( *’ ,x **]=o, so the first term on the right is equal to 
with p summing over M only. Thus (4.11) is 

(4-ia) 





and this, with (4.9), shows that fj and £*" > * are a (normal) basis of another 
(null) parallel J»-plane. 

We thus have in V n the given null ($w - i)-plane having fj(p e M) as 
basis, and the two null }»-planes having [£j, and [fj, f*"*] as 

respective bases. The in-planes intersect in the given (|» - l)-plane, and 
all are contained in the parallel (in + i)-plane having [£, f*"*] as 

basis, which is non-normal because, by (2.9), and £*** are not 
orthogonal. This (^« + i)-plane is the parallel plane conjugate to the 
given (\n - i)-plane, and is of nullity \n - I. 

So we have the theorem: When $n$ is even and $V_n$ admits a null parallel $(\tfrac{1}{2}n - 1)$-plane, it also admits two null parallel $\tfrac{1}{2}n$-planes intersecting in the given $(\tfrac{1}{2}n - 1)$-plane. These planes are all contained in the parallel $(\tfrac{1}{2}n + 1)$-plane of nullity $(\tfrac{1}{2}n - 1)$ conjugate to the given $(\tfrac{1}{2}n - 1)$-plane.

The special case of this theorem for $n = 4$ was given in § 6 of my previous paper.

5. Parallel Planes of Nullity $(\tfrac{1}{2}n - 1)$ in $V_n$ ($n$ even)

If, in $V_n$ ($n$ even), we have a parallel $(r+s)$-plane of nullity $r$, where $r = \tfrac{1}{2}n - 1$, then, because $r \le n - r - s$, by (1.2), we have $s \le 2$. Hence if a parallel plane is of nullity $(\tfrac{1}{2}n - 1)$, it is either (i) a $(\tfrac{1}{2}n + 1)$-plane; or (ii) a $\tfrac{1}{2}n$-plane; or (iii) a $(\tfrac{1}{2}n - 1)$-plane. Case (iii) was dealt with in § 4 above, and so, in effect, was case (i), since the conjugate of a parallel $(\tfrac{1}{2}n + 1)$-plane of nullity $(\tfrac{1}{2}n - 1)$ is a null $(\tfrac{1}{2}n - 1)$-plane. The only distinct case that remains for consideration is therefore (ii).
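The counting argument used here is a direct consequence of (1.2) and can be tabulated. The following sketch lists the dimensions $p = r + s$ that a parallel plane of nullity $r$ may have in a $V_n$; the function name is illustrative.

```python
def admissible_dimensions(n, r):
    """Dimensions p = r + s of parallel planes of nullity r allowed by (1.2)."""
    if r < 0 or 2 * r > n:
        raise ValueError("nullity must satisfy 0 <= r <= n/2")
    # r <= min(r + s, n - r - s) with s >= 0 reduces to 0 <= s <= n - 2r.
    return [r + s for s in range(n - 2 * r + 1)]


# n even, r = n/2 - 1: exactly the three cases (iii), (ii), (i) of the text.
print(admissible_dimensions(6, 2))
# n odd, r = (n - 3)/2: four admissible dimensions.
print(admissible_dimensions(7, 2))
```

For $n = 6$ and $r = 2$ the admissible dimensions are 2, 3 and 4, that is $\tfrac{1}{2}n - 1$, $\tfrac{1}{2}n$ and $\tfrac{1}{2}n + 1$.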

Suppose, then, that $V_n$ admits a $\tfrac{1}{2}n$-plane $\Pi$ of nullity $(\tfrac{1}{2}n - 1)$. Take a normal basis for $\Pi$ consisting of the null vectors $\xi^i_\rho \equiv [\xi^i_1, \ldots, \xi^i_{(\frac{1}{2}n-1)}]$ and of the one unit vector $h^i$ (indicator $e_1 = \pm 1$). Form a quasi-orthogonal ennuple by taking, with $\xi^i_\rho$ and $h^i$, a set of $\tfrac{1}{2}n - 1$ null vectors $\zeta^{\rho i}$ and another unit vector, $k^i$ say (indicator $e_2$). Then by (2.12) the general identities for this ennuple are of the form

Hj “ A \i $0 + 

(5.1)

-A^ + GfaV + Gfa#, 

(5.2)


(5.3)


(5.4)


where $\rho$, $\sigma$ run from 1 to $\tfrac12 n - 1$, and where $H_{21j} = -H_{12j}$.

Inasmuch as $(\xi_\rho^i, h^i)$ are a basis of a parallel plane, their covariant derivatives are linear functions of them alone, so we have, by (5.1),

and, by (5.3),



Thus (5.1), . . ., (5.4) reduce to



(5.5)

ft - Cft - APJ* + Gto* + G^‘, 

(5.6)


(5.7)


(5.8)


Of these, (5.6) is auxiliary and requires no further consideration. Equations (5.5), (5.7) are the recurrence-formulae for the given $\tfrac12 n$-plane $\Pi$, which is based on $\xi_\rho^i$ and $h^i$. Equations (5.5), (5.8) show that there is another parallel $\tfrac12 n$-plane, based on $\xi_\rho^i$ and $k^i$, intersecting $\Pi$ in its null part, i.e. in the null $(\tfrac12 n - 1)$-plane $\varpi$ based on $\xi_\rho^i$. By (5.5), or by Walker's Theorem 3.1, this null part is by itself parallel, and therefore, by § 4, $V_n$ also admits two null parallel $\tfrac12 n$-planes intersecting in $\varpi$. Moreover, if $a$, $b$ are any constants, we have, by (5.7) and (5.8),

and this, with (5.5), shows that the vectors $\xi_\rho^i$, $ah^i + bk^i$, for any constant $a$, $b$, are a basis of a parallel $\tfrac12 n$-plane. If $ah^i + bk^i$ is non-null, as will in general be the case, the $\tfrac12 n$-plane is of nullity $(\tfrac12 n - 1)$; but if the ratio $a/b$ has either of the values that make $ah^i + bk^i$ null, then the $\tfrac12 n$-plane is wholly null. These two values of $a/b$ in fact give the two null $\tfrac12 n$-planes intersecting in $\varpi$ already referred to. All of the planes are contained in the parallel $(\tfrac12 n + 1)$-plane of nullity $(\tfrac12 n - 1)$ conjugate to $\varpi$.

Summarizing these results, we have: If $n$ is even and the $V_n$ admits a parallel $\tfrac12 n$-plane of nullity $(\tfrac12 n - 1)$, then (i) its null part $\varpi$ is a null parallel $(\tfrac12 n - 1)$-plane; (ii) it admits a pencil of parallel $\tfrac12 n$-planes all intersecting in $\varpi$; of these, all except two are of nullity $(\tfrac12 n - 1)$, the exceptional two being wholly null $\tfrac12 n$-planes; (iii) all the planes are contained in the parallel $(\tfrac12 n + 1)$-plane conjugate to $\varpi$, which is of nullity $(\tfrac12 n - 1)$.

The special case of this theorem for n =4 was given in § 7 of my previous 
paper. 


6. Parallel Planes of Nullity $\tfrac12(n - 3)$ in $V_n$ ($n$ odd)

When $n$ is even, the greatest possible nullity of a plane is $\tfrac12 n$. It was seen above that a parallel plane possessing this maximum nullity had no other properties of special interest, but that parallel planes of nullity $(\tfrac12 n - 1)$, one less than the maximum, had exceptional properties.

When $n$ is odd, the maximum nullity is $\tfrac12(n - 1)$. Here again it turns out that planes having this nullity are not of special note, but that exceptional features do arise when the nullity is one less than this, namely $\tfrac12(n - 3)$.





For an $(r + s)$-plane of nullity $r$ we have $r \le n - r - s$, and hence, if $r = \tfrac12(n - 3)$, we have $s \le 3$. The only possibilities are therefore:

(i) $r + s = \tfrac12(n - 3)$, $s = 0$; (ii) $r + s = \tfrac12(n - 1)$, $s = 1$;

(iii) $r + s = \tfrac12(n + 1)$, $s = 2$; (iv) $r + s = \tfrac12(n + 3)$, $s = 3$.

As the conjugate of a null $\tfrac12(n - 3)$-plane is a $\tfrac12(n + 3)$-plane of nullity $\tfrac12(n - 3)$, possibility (iv) is equivalent to possibility (i). Similarly (iii) is equivalent to (ii). Therefore the only distinct cases that need consideration are (i) and (ii), and it will be found that only (ii) has exceptional features.

We first obtain general identities in the form needed in the present 
section. Let 

$N = 1, 2, 3, \dots, \tfrac12(n - 3)$,

$(N, m) = 1, 2, 3, \dots, \tfrac12(n - 3), m$.

In $V_n$ choose a quasi-orthogonal ennuple consisting of $m$ $[= \tfrac12(n - 1)]$ null vectors $\xi_\alpha^i$, $m$ other null vectors $\xi^{\alpha i}$, and one non-null vector $h^i$ (indicator $e$). Then the general identities (2.12) have the form

0) + [ (6.2) 

w W 1 . ) 

where $(\alpha, \beta) \in (N, m)$, the coefficients $G^{\alpha}_{\sigma j}$ of (2.12) being here written $G^{\alpha}_{j}$ because there is only the one non-null vector $h^i$. There is no term in (6.2) (c) corresponding to $h^i$ because $H_{\sigma\tau j}$ in (2.12) (c) is skew in $\sigma$, $\tau$.

Suppose now that we have in $V_n$ a null parallel $\tfrac12(n - 3)$-plane (case (i) above), having as basis the null vectors $\xi_\rho^i$ $(\rho \in N)$. Then the recurrence-formulae for the plane are of the form

ilj “ f(p» 0) < W- (6.3) 

Form a quasi-orthogonal ennuple by taking, with the given vectors $\xi_\rho^i$, a vector $\xi_m^i$ $[m = \tfrac12(n - 1)]$, thereby making a set $\xi_\alpha^i = (\xi_\rho^i, \xi_m^i)$, and also $\tfrac12(n - 1)$ null vectors $\xi^{\alpha i}$ and one unit vector $h^i$. The vectors of this ennuple will satisfy identities of the form (6.2), and (6.2) (a) must therefore include (6.3). Rewriting (6.2) (a) for the cases $\alpha \ne \tfrac12(n - 1)$ and $\alpha = \tfrac12(n - 1) = m$, we get





(6.4) 

(6.5) 




Equation (6.4) must be the same as (6.3), and so 

[p*W,P'W, «)]. 

Because $B_{\alpha\beta j}$ is skew in $\alpha$, $\beta$, it follows that $B_{\alpha\beta j} = 0$ for all $(\alpha, \beta) \in (N, m)$.

Thus (6.4) becomes (6.3), (6.5) reduces to



(6.2) (b) remains unaltered, and (6.2) (c) becomes



the last equation involving only the one $\xi$-vector $\xi^{mi}$.

The recurrence-formulae for the case at present under consideration are the three "boxed" equations (6.3), (6.6), (6.7), with (6.2) (b) as an auxiliary.

Equation (6.3) is the original recurrence-relation for the given null $\tfrac12(n - 3)$-plane. The three boxed equations show that the vectors $\xi_\rho^i$, $\xi_m^i$, $\xi^{mi}$ and $h^i$ are a (non-normal) basis for a parallel $\tfrac12(n + 3)$-plane. It is the $\tfrac12(n + 3)$-plane conjugate to the given null $\tfrac12(n - 3)$-plane.

In general the only other parallel planes in $V_n$ are the 0-plane and $n$-plane, so this case, unlike the corresponding one for even $n$, is not exceptional. Exceptional features might arise by accident, so to speak; for example if $F_{mj}$ happened to be identically zero. In that case, by (6.3) and (6.6), the $V_n$ would also admit a null parallel $\tfrac12(n - 1)$-plane based on the vectors $\xi_\alpha^i = (\xi_\rho^i, \xi_m^i)$ and, by (6.3) and (6.7), a parallel $\tfrac12(n - 1)$-plane of nullity $\tfrac12(n - 3)$ based on $(\xi_\rho^i, h^i)$.

This brings us to the last case to be considered in the present paper. Suppose, without reference to the remark just made, that a $V_n$ ($n$ odd) admits a parallel $\tfrac12(n - 1)$-plane $\Pi$ of nullity $\tfrac12(n - 3)$ (case (ii) above). Let the vectors of a normal basis be the null vectors $\xi_\rho^i$ $(\rho \in N)$ and the unit vector $h^i$. The recurrence-formulae are then of the form

(6.9)

There is no term in $h^i$ on the right-hand side of (6.9) because, if there were such a term, $H_j h^i$ say, we should obtain $H_j = 0$ on multiplying (6.9) by $h_i$ and using the fact that $h^i$ is a unit vector orthogonal to the vectors $\xi_\rho^i$ $(\rho \in N)$. [Cf. (6.2) (c).]

If we form a quasi-orthogonal ennuple consisting of $\xi_\rho^i$, $h^i$ and of selected vectors $\xi_m^i$ $[m = \tfrac12(n - 1)]$ and $\xi^{\alpha i}$ $[\alpha \in (N, m)]$, we again have the general identities (6.2), and these must include (6.8) and (6.9). As before, we separate (6.2) (a) into the two sets (6.4) and (6.5). Comparison of (6.4) and (6.8) shows that $B_{\rho\beta j} = 0$ and hence, as before, that

$B_{\alpha\beta j} = 0$   $[(\alpha, \beta) \in (N, m)]$,   (6.10)

and also that

$A^{m}_{\rho j} = 0$   $[\rho \in N,\ m = \tfrac12(n - 1)]$.   (6.11)

Similarly, a comparison of (6.2) (c) and (6.9) shows that

$F_{\beta j} = 0$   $[\beta \in (N, m)]$,   (6.12)

and that

$G^{m}_{j} = 0$   $[m = \tfrac12(n - 1)]$.   (6.13)

By (6.10), (6.11), (6.12), the first $\tfrac12(n - 3)$ equations (6.2) (a), that is, equations (6.4), become (6.8) with $F_{\rho j}$ zero. So we have

[(p,o)tN]. (6.14) 

The remaining equation (6.2) (a), namely (6.5), becomes, by (6.10) and (6.12),

frnj - 4 i.ftp IP « W «). tn --1)]. (6.1s) 

The first $\tfrac12(n - 3)$ of equations (6.2) (b) become

?\i “ + G$eh< [ptN,^t(N, m)l 

and are auxiliary, while the remaining equation (6.2) (b), corresponding to $\alpha = \tfrac12(n - 1) = m$, becomes

[otN,m — \(m -1)]. (6.16) 

The first term on the right is $C^{m\sigma}{}_{j}\,\xi_\sigma^i$ $(\sigma \in N)$ and not $[\beta \in (N, m)]$ because $C^{mm}{}_{j} = 0$ on account of the skewness of $C^{\alpha\beta}{}_{j}$ in $\alpha$, $\beta$. The second term involves only $A^{m}_{mj}$ of the set $A^{m}_{\beta j}$ because of (6.11), and there is no term in $h^i$ because of (6.13).

Finally, (6.2) (c) reduces to (6.9), namely 



(aeN). 


(6.17) 


The boxed equations are recurrence-formulae for the case at present under consideration. Equations (6.14) and (6.17) are the recurrence-formulae for the original $\tfrac12(n - 1)$-plane $\Pi$ of nullity $\tfrac12(n - 3)$, of normal basis $(\xi_\rho^i, h^i)$. Equation (6.14) alone is the recurrence-formula for its null part $\varpi$, which is the null parallel $\tfrac12(n - 3)$-plane of basis $\xi_\rho^i$. Equations (6.14) and (6.15) together show that $V_n$ also admits the null parallel $\tfrac12(n - 1)$-plane, $\Pi_1$ say, having $\xi_\alpha^i = (\xi_\rho^i, \xi_m^i)$ as (normal) basis and intersecting $\Pi$ in its null part $\varpi$. Similarly, (6.14) and (6.16) show that $(\xi_\rho^i, \xi^{mi})$ are a basis, normal because $\xi^{mi}$ is orthogonal to all the vectors $\xi_\rho^i$ $(\rho \in N)$, of another null parallel $\tfrac12(n - 1)$-plane $\Pi_2$, which also intersects $\Pi$ in $\varpi$.

The $\tfrac12(n + 1)$-plane $P$ of basis $(\xi_\rho^i, \xi_m^i, \xi^{mi})$, which basis is non-normal because $\xi_m^i$, $\xi^{mi}$ are not orthogonal, is of nullity $\tfrac12(n - 3)$ and is parallel, by (6.14), (6.15), (6.16). It is the conjugate of the original $\tfrac12(n - 1)$-plane $\Pi$. Similarly the $\tfrac12(n + 1)$-planes $P_1$, $P_2$, of respective (normal) bases $(\xi_\rho^i, \xi_m^i, h^i)$ and $(\xi_\rho^i, \xi^{mi}, h^i)$ are of nullity $\tfrac12(n - 1)$, and are parallel. They are the conjugates of $\Pi_1$, $\Pi_2$. Lastly, the $\tfrac12(n + 3)$-plane $p$ of (non-normal) basis $(\xi_\rho^i, \xi_m^i, \xi^{mi}, h^i)$ is of nullity $\tfrac12(n - 3)$ and is parallel. It is the conjugate of the null part $\varpi$ of the original plane $\Pi$. Clearly $p$ contains all the planes previously considered.

So we have: If $n$ is odd and $V_n$ admits a parallel $\tfrac12(n - 1)$-plane $\Pi$ of nullity $\tfrac12(n - 3)$, then it also admits the following parallel planes: (i) the null $\tfrac12(n - 3)$-plane $\varpi$, which is the null part of $\Pi$; (ii) two null $\tfrac12(n - 1)$-planes $\Pi_1$, $\Pi_2$; (iii) the $\tfrac12(n + 3)$-plane $p$, of nullity $\tfrac12(n - 3)$, conjugate to $\varpi$; (iv) the $\tfrac12(n + 1)$-plane $P$ of nullity $\tfrac12(n - 3)$, conjugate to $\Pi$; (v) the two $\tfrac12(n + 1)$-planes $P_1$, $P_2$, both of nullity $\tfrac12(n - 1)$, conjugate to $\Pi_1$ and $\Pi_2$. All of these planes intersect in $\varpi$ and are contained in $p$.

I have nowhere in this paper considered the conditions of integrability of recurrence-formulae for parallel planes, nor the existence of spaces $V_n$ containing parallel planes of the types discussed. Conditions of integrability for the particular case of a null parallel 2-plane in $V_4$ were given in my previous paper [equations (5.5), (5.6)], and similar conditions could be found for the general formulae (3.5) (a), (c) of this paper. The existence of spaces containing parallel planes of the types considered above follows from theorems due to Walker,* who has obtained canonical forms for the metrics of such spaces.

* A. G. Walker, "Canonical Form for a Riemannian Space with a Parallel Field of Null Planes", Quart. Journ. Math. (Oxford, Second Series), I, 1950, 69-79; and a further paper, at present in the press (May 1950), to appear in the same Journal.


(Issued separately June 16, 1950)





VIII.—A Further Note on a Problem in Factor Analysis.* 

By D. N. Lawley, M.A., D.Sc., University of Edinburgh 

(MS. received November 5, 1949. Read January 9, 1950) 

In a recent paper † by the author several unfortunate omissions and misstatements occurred which it seems desirable to correct. I am indebted to Professor T. W. Anderson for pointing these out to me.

Firstly, it was stated that the matrix of true loadings $K$ was defined uniquely if it were chosen so as to make $J = K'T^{-1}K$ a diagonal matrix (adopting previous notation). A further condition is however required, namely, that no two elements of $J$ should be equal. This condition would almost certainly be satisfied in practice, but should have been stated explicitly.

It is also necessary to order the columns of $K$ in such a way that the $r$th column corresponds to the $r$th largest element of $J$. The elements of $J$ are then arranged in order of magnitude along the diagonal. A similar consideration applies with regard to the columns of $\hat{K}$, the matrix of estimated loadings. We are thus able to set up a correspondence between the columns of $K$ and those of $\hat{K}$.

A misprint has occurred in § 3 of the paper. Line 3 of page 396 
should read as follows:— 

“ o, if either i < m or J < m'\ 

These restrictions imply the equation 

kt-\a- 6 )~ 0, 

which, to a sufficient approximation, is equivalent to equation (3). 

In § 4 the conditions 

»*<r-o (»'< r) 

are equivalent to the assertion that the above-diagonal elements of

1 u V M W JJ 

K> 7 ’ -1 a are zero, whereas in actual practice K is chosen such that a T~*K 
is diagonal. What has in fact been done is to neglect errors of estimation 
of the loadings in the first (r - 1) factors when estimating those in the rth 
factor. Hence the statement that for $r \ne s$ the covariance between $l_{ir}$ and

* This paper was assisted by a grant from the Carnegie Trust for the Universities of 
Scotland. 

† "Problems in Factor Analysis", Proc. Roy. Soc. Edin., A, lxii, 394-399.




$l_{js}$ is zero is incorrect. The formulae (6) represent the partial variances and covariances of $l_{1r}$, $l_{2r}$, . . ., $l_{nr}$ for given values of the loadings in all previously estimated factors.

While these partial variances and covariances would seem to be of more 
practical use than the total ones, the values of the latter could probably be 
obtained by employing previous methods. 


(Issued separately June 16, 1950)





IX.—A Measurement of the Velocity of Light* By R. A. 
Houstoun, M.A., D.Sc., F.Inst.P., Natural Philosophy Department, University of Glasgow. (With Four Text-figures.)

(MS. received August 23, 1949. Read November 7, 1949)


Synopsis 

The author has already described a new method of measuring the velocity of light, 
which replaces Fizeau’s toothed wheel by a piezo-quartz. This acts as an intermittent 
diffraction grating, and it interrupts the beam 200 times more rapidly than Fizeau’s 
wheel did. 

The present paper describes the application of the method to the measurement of the 
velocity of light in air. The total length of path was 78 metres. The frequency of 
interruption was measured by comparison with the Droitwich radio station. The result 
reduced to vacuum is 299,775 kilometres per second and is in agreement with other 
recent determinations, but, as a result of the experience gained, it will be possible to 
increase the accuracy at least ten times. 

In the Proc. Roy. Soc. Edin. I have described a new method of measuring 
the velocity of light and have given results obtained by this method for 
water (Houstoun, 1941, 1944). The present paper describes an application 
of the same method to the measurement of the velocity of light in air. 
Only one previous measurement of the velocity of light in air has been 
made in this country, that of Young and Forbes † in 1881, the eleven
determinations considered by Birge (Birge, 1941) in his review of the 
subject being made in America, France and Germany. So this paper 
breaks a long silence. 


Method 

As has been explained in the previous papers, my method consists in 
replacing Fizeau’s toothed wheel by a rectangular piezo-quartz. This has 
the property of acting as an intermittent diffraction grating when it is 
placed in an alternating electric field. Stationary elastic waves are set up
in the quartz, the refractive index being increased at condensations and 

* The research was assisted and this paper was assisted in publication by grants 
from the Carnegie Trust for Universities of Scotland. 

† George Forbes, it is interesting to note, was an F.R.S.E. and Professor of Natural
Philosophy at the Andersonian College in Glasgow, and his determination was made
across the Firth of Clyde from Wemyss Bay to Innellan. His father was Professor of 
Natural Philosophy in the University of Edinburgh, and Principal of the University of 
St Andrews. 





diminished at rarefactions; thus the surface of a light-wave travelling in 
the quartz at right angles to the elastic wave becomes corrugated, and 
these corrugations cause diffracted beams to appear on each side of the 
main beam twice in each electrical period. 

Let SQ (fig. 1) represent a beam entering the quartz and QM a first
diffracted beam. If this is reflected back on its path by a mirror M and 
arrives at Q when the grating is in action, it is diffracted back again to S; 
otherwise it travels on to S'. The advantage of this method of interrupting 



the beam over the toothed wheel is that, as I have used it, the beam 
is interrupted 200 times as rapidly. There is, however, one important 
difference. The speed of the wheel is varied to make the intensity of the 
returning beam a minimum. But the quartz behaves as a grating only 
when one of its free periods is in resonance with the alternating field; 
thus the minimum can be obtained only by sliding M along QM. 


Fig. 2.



Fig. 2 shows the experimental arrangement used for the present 
measurement. P is a 100-candle-power Pointolite lamp, which was 
enclosed in a box; the light was reflected by a 45° glass prism and focused 
by the condenser C on the slit of a spectrometer. After being rendered 
parallel the beam passed through the glass plate G and the piezo-quartz Q, 
which lay between the condenser plates of an oscillator, not shown in the 
figure. It then travelled a distance of about 19 metres to the plane mirror 
M, was reflected to the lens-mirror combination B, returned on its path, 
and was reflected by the glass plate G through the telescope to the eye of 
the observer at O. 

The deviation of the first-order spectrum is about 35'. The adjustment of the mirror M and lens-mirror B is done best with the direct image. The end of the quartz was bevelled, therefore, so as to give the direct image a deviation of 35', and a glass wedge W inserted to remove this deviation. The adjustment of the line was done with this wedge in position. When the oscillator was started and the wedge removed, the first-order spectrum automatically used the same line.

The oscillator is represented diagrammatically in fig. 3. It is a push-pull one employing two Osram D.E.T. 12 valves, and the quartz was placed between the plates of the condenser C. These were horizontal; their surfaces measured 40 × 13 mm., and they were about 1 mm. thick.



The quartz measured 3.51 × 11.03 × *2.975 mm., and the upper condenser plate rested loose on the top of it. The 11.03 edge was in the direction of the optic axis. The parallel rods are 26 cm. long and 4.5 cm. apart. Tuning was done by moving a copper rectangle backwards and forwards by a screw motion. The position of this rectangle is indicated by the broken line; it was about 8 mm. above the plane of the parallel rods. Filament current was supplied by four large storage cells, and the anode current was taken from two 120-volt high tension batteries. The observer had the slow motion for tuning the oscillator close to his right hand, and its frequency range was from about 104 to about 110 megacycles per second. The distance GQ was about 40 cm.; this was necessary to allow stray light from the quartz to diffuse away before reaching the object-glass of the telescope.

The apparatus was set up on two tables in a research room on the 
second floor of the department, and the beam left this room by a window 
and travelled the full length of a students' laboratory to the mirror M;
measurements could be made only in the vacation when the laboratory 
was not in use. The mirror was a heliograph one, of diameter 13 cm. 

The length of the path was measured by a steel tape supplied by James Chesterman & Co., Leeds; this hung in a single catenary under a stretching force of 20 lb. It was correct at 68° F., and its coefficient of expansion was $6.2 \times 10^{-6}$ per degree Fahrenheit. The one end was hooked on a pillar at H; the other hung either over a cylinder at X or over a cylinder at Y, carrying weights at its end, and it could be lifted easily from the line XH to the line YH, or out of the way to the line ZH. The tape had only two graduations, at 0 and at 20 metres, and the positions of the quartz, mirror M and lens-mirror B were obtained by fixing 100 and 20 cm. graduated metal scales on to it and measuring from these points.

The lens-mirror combination B consisted of an achromatic lens of 
25 cm. focal length and 4 cm. diameter, and a thin glass mirror mounted 
at its focus. They were fixed in a tube the direction of which could be 
adjusted very accurately, and which slid along a metal optical bench of 
length 110 cm. 

The measurement of the velocity falls into two parts: the measurement 
of the electrical wave-length— i.e. the distance light travels in air during 
one electrical period—and the measurement of the electrical frequency. 
After the mirror M and the lens-mirror B were aligned, the observer 
tuned the oscillator into resonance with an harmonic of the quartz and 
removed the wedge W. This displaced the first-order image into the 
position previously occupied by the direct image. He then adjusted the 
lens-mirror, so as to make the intensity of the image a minimum. The 
lens-mirror was close to his right hand when his eye was at the eyepiece. 
Twenty settings were made in succession and the mean taken. The 
electrical wave-length was derived from this mean in the following 
manner. 

The tape was switched into the line XH and the positions of the 
right end of the quartz and the glass-air face of the mirror noted. The 
lens-mirror was then moved to the end of its scale, the tape switched to 
the line YH, and the positions of the end of the tube of the lens-mirror 
and the glass-air surface of M again noted. The tape passed about 1 cm. 
above the top of M, about 5 cm. above the quartz and about 1.8 cm. above
the end of B. The readings were taken by laying a straight edge on the 
face of the mirror, and by placing a celluloid square on the top of B or on 
the surface on which the condenser plates rested. They could be made 
almost to one-tenth of a millimetre. They gave the uncorrected length. 

Corrections have to be made for (a) half the optical length of the quartz, (b) the optical thickness of the mirror M, (c) the object-glass of the lens-mirror, and (d) the distance from the end of the tube of the lens-mirror to the reflecting surface. Their values are 0.87, 0.91, 0.27 and 1.89 cm. respectively, giving a total of 3.94 cm. The group index of refraction was used, but the difference is hardly appreciable.

It will make matters clearer if we give the numerical values for one set of readings, the twelfth set, selected because they were average values. The twenty observations in this set gave a mean scale reading of 19.45 cm. The uncorrected length was 3916.9 cm. This makes the corrected length

3916.9 + 3.94 − 19.45 = 3901.4 cm.

As the light traverses this going and coming, we have finally for the length of the path

2 × 3901.4 = 7802.8 cm.

For a minimum this must equal $(n + \tfrac12)\,\lambda/2$, where $n$ is an integer and $\lambda$ is the electrical wave-length. For the frequency used $n$ was 55. Thus, in this case,

$\lambda = \dfrac{4}{111} \times 7802.8 = 281.18$ cm.

The frequency was measured by a wavemeter consisting of an oscillator working through the range 8 to 12 megacycles per second, and a one-megacycle per second quartz crystal oscillator, which controlled multivibrators at frequencies of 100 and 10 kilocycles per second. For the measurements recorded in the present paper my piezo-quartz was operated on its 135th harmonic. Its frequency was then slightly greater than 13 × 8200 kc., but during a set of readings the quartz heated up and the frequency generally fell below this value. No attempt was made to keep the temperature constant; the change was small and occurred at a uniform rate, so it could be eliminated mathematically. It was found advantageous to keep the 10 kc. multivibrator permanently switched off and measure the frequency of the piezo-quartz from the 8200 kc. note. There was no difficulty in detecting the one note from the other; the 100 kc. multivibrator gave a slow and leisurely resonance, the piezo-quartz one that was louder and much sharper.

During the twelfth set of observations the piezo-quartz resonance moved from the multivibrator resonance to ⅓ division of the condenser scale below it. A division on this scale is equal to 1.96 kc. The mean frequency of the piezo-quartz was thus

(8200 − ⅙ × 1.96) × 13 = 106,596 kc.

These observations therefore gave for the velocity of light in air

$281.18 \times 1.06596 \times 10^{8} = 2.9973 \times 10^{10}$ cm. per sec.
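The frequency and velocity for the twelfth set follow from the figures just given. In this sketch the mean offset of the resonance is taken as one-sixth of a condenser-scale division, which is an assumption chosen because it reproduces the quoted 106,596 kc.; all names are illustrative.

```python
MULTIVIBRATOR_NOTE_KC = 8200.0  # reference note from the 100 kc. multivibrator
KC_PER_DIVISION = 1.96          # one condenser-scale division
NOTE_MULTIPLE = 13              # quartz frequency is close to 13 x 8200 kc.


def quartz_frequency_kc(mean_offset_divisions):
    """Mean quartz frequency in kc. for a given mean offset below the note."""
    note = MULTIVIBRATOR_NOTE_KC - mean_offset_divisions * KC_PER_DIVISION
    return note * NOTE_MULTIPLE


def velocity_cm_per_sec(wavelength_cm, frequency_kc):
    """Velocity = wave-length x frequency, with kc. converted to cycles per sec."""
    return wavelength_cm * frequency_kc * 1.0e3


frequency = quartz_frequency_kc(1.0 / 6.0)  # assumed mean offset
velocity = velocity_cm_per_sec(281.18, frequency)
print(round(frequency), f"{velocity:.4e}")
```

With these inputs the frequency rounds to 106,596 kc. and the velocity in air to 2.9973 × 10^10 cm. per sec., the value entered for the twelfth set.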

The wavemeter was standardised by means of the Droitwich carrier wave; the frequency of the latter is 200 kc. per sec., and it is maintained very closely to this value. It was received by an aerial, amplified, and put on the X plates of a Double Beam Cossor oscillograph. The 100 kc. multivibrator was put on the Y plates. The pattern was usually almost stationary, and never rotated faster than once in three seconds, so it was never necessary to adjust the frequency of the one megacycle crystal.


Results 

In the final determination 400 settings were made. These were made 
in 20 sets, 20 in each set, during the month of August 1949. They were 
consecutive settings; once a setting was made, the reading was entered 
on the record, no matter how "bad” it was, and once a set was started, it 
was completed. If the work did not appear to be going well, longer rests 
were taken between the settings. I made all the settings myself. The 
frequency was taken four times during each set, temperature and barometer 
were read after each set, all the distances were measured after each set, 
and everything was thrown completely out of adjustment and readjusted 
before the next set was started. The results for each set are given in 
$10^{6}$ cm. per sec. in the following table; they are for air, and they are
corrected for the expansion of the measuring tape. 


Set   Velocity      Set   Velocity
  1    29961         11    29966
  2    29968         12    29973
  3    29979         13    29974
  4    29961         14    29972
  5    29979         15    29975
  6    29976         16    29965
  7    29973         17    29977
  8    29959         18    29962
  9    29972         19    29972
 10    29963         20    29969


They are graphed in fig. 4. The points in this graph represent the 
mean results obtained from the first 20, 40, 60, 80 . . . observations, 
and show the approach of the mean towards its final value. In the 
review already cited Birge makes an estimate of the value of the velocity 
on the basis of the most accurate determinations hitherto made, and in a 
more recent critical examination of the field, Dorsey (Dorsey, 1944) makes 
a similar estimate. Birge’s and Dorsey’s means, reduced to air, are 
shown respectively by the upper and lower lines in the graph, and my 
final point lies between them, though, the probable error being what it is, 
this must be regarded as an accident. 

My final value for air is 299,698 ± 9 km. per sec. Using Birge's correction, this becomes 299,775 ± 9 for a vacuum. The probable error was calculated from the 20 means given in the preceding table. The following table gives other values for a vacuum:

Birge, estimated                      299776 ± 4
Dorsey, estimated                     299773 ± 10
Karolus and Mittelstaedt, 1929        299786 ± 10   }
Michelson, Pease and Pearson, 1935    299774 ± 4    }
Anderson, 1937                        299771 ± 10   }  As revised by Birge
Hüttl, 1940                           299771 ± 10   }
Anderson, 1941                        299776 ± 6    }

Michelson, Pease and Pearson's determination is the mean of 2885 
observations, Anderson’s second determination of 2895 observations. 
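The final value for air and its probable error can be rechecked from the twenty set means tabulated earlier. The sketch assumes that the table entries are in units of 10^6 cm. per sec. (10 km. per sec.), that the readings are treated as independent, and that the probable error of the mean is 0.6745 times its standard error; the paper does not state which formula was used, so the second figure is indicative only.

```python
import math

# The twenty set means for air, as tabulated (assumed unit: 10^6 cm. per sec.).
SET_MEANS = [
    29961, 29968, 29979, 29961, 29979, 29976, 29973, 29959, 29972, 29963,
    29966, 29973, 29974, 29972, 29975, 29965, 29977, 29962, 29972, 29969,
]


def mean_and_probable_error(values):
    """Return the mean and 0.6745 x (standard error of the mean)."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    standard_error = math.sqrt(variance / n)
    return mean, 0.6745 * standard_error


mean, probable_error = mean_and_probable_error(SET_MEANS)
# Convert from 10^6 cm. per sec. to km. per sec. (factor of 10).
print(f"{mean * 10:.0f} km. per sec., probable error {probable_error * 10:.1f}")
```

The mean comes to 299,698 km. per sec., agreeing with the value quoted for air; the probable error on this assumption is roughly 9.5 km. per sec., close to the quoted 9.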



Fig. 4.


An interesting point has cropped up in connection with the probable 
error. The observer sits with his chin on a rest and his eye at the telescope 
and sweeps the lens-mirror back and forward along the bench, until he is 
going from just perceptible increment on the one side to just perceptible 
increment on the other. He then crosses this region in little steps, usually 
six, and, when he thinks he is doing this satisfactorily, stops at the third 
step. This should bisect the difference. But if he has the habit of 
stopping too soon or too late, the observations fall into two groups according 
as the final steps were inward or outward, and each of these groups has 
its own probability curve. The errors of the one group are to some extent 
compensated for by the errors of the other. If this is disregarded, the 
calculated probable error will be too high. In my measurement of the value of the velocity of light in water (Houstoun, 1944) I comment on the fact that the agreement with theory is much better than would be expected from the probable error. In that determination the settings were made "in" and "out" alternately, and I have now no doubt that the value given for the probable error is too high. In the present investigation there were 200 "ins" and 200 "outs". They were made in no definite order but just
as they came, only the last six or seven being directed, in order to make the 
totals balance. Although some of the 20 sets were well balanced, others 
were predominantly “in” or “out”. Unfortunately the bearing of this 
point on the probable error did not occur to me until after the paper 
recording the directions of the single settings was thrown away. So I 
calculated the probable error, as if the readings were independent, and the 
value given, 9 km. per sec., is definitely too high. 

I have not seen the above point referred to before; it may affect the 
estimates of the accuracy of some of the other determinations of the velocity 
of light. 

I have been asked whether the accuracy of the later measurements is 
greater than that of the earlier, and whether a better mean could be obtained 
by neglecting some of the earlier. The later measurements are slightly 
nearer the mean, but that by itself is not a reason for preferring them, and 
I think it is safer, as I have done, to give all observations equal value. 
Later measurements are sometimes more consistent owing to habit-forming; in the present case this was guarded against by displacing the mirror M an inch or two between the sets of readings.

The resonance bands of the piezo-quartz are about one kilocycle wide. 
The low frequency edge is sharp and very bright, and the intensity diminishes 
gradually from this edge to the other side. The sharpness of the minimum 
depends on the intensity, there being one particular frequency which gives 
the best results. 

During the measurements the temperature of the laboratory varied from 64 to 80° F. and the barometer from 75.4 to 76.2 cm. It was not necessary to apply a correction for pressure. The average correction for the temperature of the measuring tape was about 2 km. per sec.; the maximum correction for the same was 19 km. per sec.
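The size of the tape correction can be estimated from the figures given for the tape. The sketch assumes a coefficient of expansion of 6.2 × 10^-6 per degree Fahrenheit, a tape correct at 68° F., and a correction to the velocity proportional to the fractional change in tape length; the function name and the round value taken for the velocity are illustrative.

```python
TAPE_CORRECT_AT_F = 68.0
EXPANSION_PER_DEG_F = 6.2e-6  # assumed reading of the tape's coefficient


def tape_correction_km_per_sec(temperature_f, velocity_km_per_sec=299_700.0):
    """Correction to the velocity for a tape at the given temperature.

    A warm tape is longer than its nominal length, so the true path,
    and hence the velocity, is larger than the uncorrected figure.
    """
    fractional_change = EXPANSION_PER_DEG_F * (temperature_f - TAPE_CORRECT_AT_F)
    return velocity_km_per_sec * fractional_change


for temperature in (64.0, 68.0, 78.0):
    print(temperature, round(tape_correction_km_per_sec(temperature), 1))
```

On these assumptions a tape about 10° F. above its standard temperature gives a correction of roughly 19 km. per sec., the size of the largest correction quoted.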

Discussion of the Results 

The chief source of error lies in determining the position of the minimum. 
Various optical devices were tried to diminish this, but they were all 
found inferior to the direct method. Some observers are much better at 
determining the minimum than others; I am above the average, but I 
have had one observer who could fix it three times as accurately as I can. 




With a powerful arc lamp it would be possible to increase the range ten 
times. So with the same apparatus and procedure the accuracy could be 
increased thirty times. 

The accuracy might also be increased by increasing the frequency of 
the oscillator, as it is the ratio of electrical wave-length to length of path 
that matters. I worked on an electrical wave-length of 281 cm. If this 
were diminished to 28.1 cm., with the same range the accuracy would be
increased ten times. It has yet to be shown, however, that a quartz would 
give bright diffraction bands on this wave-length. 
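The scaling argument above reduces to one line of arithmetic. The sketch assumes, as the text states, that the attainable accuracy varies inversely as the ratio of electrical wave-length to length of path; the function name and the 3900 cm. range are illustrative only.

```python
def accuracy_gain(wavelength_cm, range_cm, new_wavelength_cm, new_range_cm):
    """Factor by which accuracy improves, assuming error ~ wavelength / range."""
    old_ratio = wavelength_cm / range_cm
    new_ratio = new_wavelength_cm / new_range_cm
    return old_ratio / new_ratio


# Shorter electrical wave-length, same range.
print(accuracy_gain(281.0, 3900.0, 28.1, 3900.0))
# Same wave-length, range increased ten times.
print(accuracy_gain(281.0, 3900.0, 281.0, 39000.0))
```

Either change alone gives a factor of about ten; the further factor of three mentioned earlier comes from a more skilful observer and is not part of this ratio.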

For interrupting the beam, Mittelstaedt, Anderson and Hüttl used a
Kerr cell in an alternating field. Mittelstaedt received the beam through 
another Kerr cell, which was placed in a field of the same frequency, and 
obtained his minimum by altering the frequency. At his minimum the 
intensity went down to zero. Anderson used two paths of unequal 
length. After passing through the Kerr cell the light divided at a mirror, 
and the two beams reunited at the same mirror and fell on a photoelectric 
multiplier. The current from the multiplier passed through a circuit 
tuned to the frequency of the interruption. If the groups arriving by the 
two paths were out of phase, the response was a minimum. The principle 
is thus that of the neutralising condenser. It is arranged that the groups 
are out of phase by drawing back a mirror; the difference of path is then 
an odd multiple of half the electric wave-length. Hüttl received his beam
on a vacuum photo-cell, the anode potential of which varied with the same
frequency as the Kerr cell. Anderson worked at 19.2 megacycles per
second and Hüttl at 4-12 megacycles per second, and it is doubtful whether
their techniques would work at my frequency. There might be trouble 
with the transit times of the electrons in the photo-cells. This is a point 
that should be investigated. 
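The comparison of working frequencies above can be put in common units with a short sketch. It converts the electrical wave-length of 281 cm into a frequency, converts Anderson's 19.2 megacycles per second into a wave-length, and lists the first few path differences that satisfy his minimum condition (odd multiples of half the electric wave-length). The function names are illustrative only, and the value of c used is the modern defined one, which is an assumption brought in for the arithmetic and not a figure from this paper.

```python
# Assumption: modern defined value of the velocity of light in vacuum (m/s),
# used only to convert between wave-length and frequency.
C = 299_792_458.0


def frequency_from_wavelength(wavelength_m: float) -> float:
    """Frequency in cycles per second for a given electrical wave-length in metres."""
    return C / wavelength_m


def wavelength_from_frequency(frequency_hz: float) -> float:
    """Electrical wave-length in metres for a given frequency in cycles per second."""
    return C / frequency_hz


def anderson_minimum_path_differences(frequency_hz: float, count: int) -> list:
    """First `count` path differences (m) that are odd multiples of half the wave-length."""
    half_wavelength = wavelength_from_frequency(frequency_hz) / 2.0
    return [(2 * n + 1) * half_wavelength for n in range(count)]


if __name__ == "__main__":
    # 281 cm corresponds to roughly 107 megacycles per second.
    print(frequency_from_wavelength(2.81) / 1e6)
    # 19.2 megacycles per second corresponds to a wave-length of about 15.6 m.
    print(wavelength_from_frequency(19.2e6))
    print(anderson_minimum_path_differences(19.2e6, 3))
```

On these figures the piezo-quartz worked at a frequency between five and six times that of Anderson's Kerr cell.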

Rotating mirrors, toothed wheels and rotating prisms are now definitely 
outclassed; they are too sensitive to the friction of their bearings, and they 
require great distances if the result is to be accurate. No one working on 
these lines will do better than Michelson, Pease and Pearson. The Kerr 
cell must be water-cooled, and requires a voltage of 6000 volts or so to bias 
it and an alternating e.m.f. of about 2000 volts to operate it. It seems to 
have reached its limiting frequency, it is somewhat messy to work with, and 
its possibilities have been fully explored. The requirements of the piezo-quartz
are more modest and it gives a feeling of aesthetic satisfaction.
There are no systematic errors or concealed traps associated with it. But 
the minimum is unsatisfactory, having about one-third the intensity of the 
maximum, and to say that its determination gives no feeling of aesthetic 
satisfaction is an understatement. In any case, either of these methods 





is capable of giving a value for the velocity at least ten times as accurate as 
anything that has hitherto been obtained. 

A length can be measured by the time that light takes to traverse it, 
and it is interesting to note that this method is at present being developed 
in Sweden (Bergstrand, 1943) for use by the Ordnance Survey in measuring 
the sides of triangles. The apparatus is based on Hüttl's method.


Acknowledgments 

In making the frequency measurements recorded in this paper I have 
received much help from Dr Walter McFarlane; I have also received
assistance in tuning up the apparatus from two of my students, Mr G. H. 
Robertson and Mr J. I. B. Macfadyen. 


REFERENCES TO LITERATURE 

Bergstrand, Erik, 1943. “Measurement of Distances by High Frequency
Light Signalling”, Ark. Mat. Astr. Fys., xxixA, No. 30.

Birge, Raymond T., 1941. “The General Physical Constants”, Reports on
Progress in Physics, viii, 90.

Dorsey, N. Ernest, 1944. “The Velocity of Light”, Trans. Amer. Phil. Soc.,
xxiv, part 1.

Houstoun, R. A., 1941. “A New Way of Measuring the Velocity of Light”,
Proc. Roy. Soc. Edin., A, lxi, 102.

Houstoun, R. A., 1944. “A Measurement of the Velocity of Light in Water”, ibid., A,
lxii, 58.


(Issued separately June 16, 1950)





X.—The Reciprocity Theory of Electrodynamics.* By H. S.
Green and K. C. Cheng, University of Edinburgh. Communicated
by Professor Max Born, F.R.S.

(MS. received July 1, 1949. Read December 5, 1949)


SYNOPSIS 

This paper represents the application of the Principle of Reciprocity, formulated in a
previous communication, to the outstanding problems of classical and quantum
electrodynamics.

The first step consists in the formulation of a reciprocally invariant Lagrangian 
function for a system of electrons in interaction with the electromagnetic field. A study 
is made of the unaccelerated motion of an electron, and this is subsequently extended to 
embrace the problem of an electron in arbitrary motion. It is found that the usual 
difficulties of classical electrodynamics do not appear. The methods of the earlier 
paper are applied to the derivation of the Hamiltonian energy of electron and field, and 
this enables a quantized formulation of the theory to be given, which also does not lead 
to the usual divergence difficulties. 


1. Introduction

It is well known that the present state of the theory of the electron, in 
both classical and quantum electrodynamics, cannot be regarded as 
satisfactory owing to certain difficulties of a fundamental nature. In the 
classical theory these difficulties are associated with the divergence of 
the self-energy of the electron when it is regarded as a point, and the 
failure of various alternative models with a finite structure to meet 
relativistic requirements. Early hopes that the application of the quantum 
theory would remedy these defects were discouraged when it became 
apparent that the very process of quantization introduced divergent 
terms additional to those already present in the classical theory. Even in 
the “hole theory” of Dirac the calculations of Weisskopf (1934) revealed 
a divergence of both the electrostatic and electromagnetic self-energy of 
the electron, which further recent developments by Schwinger (1948, 
1949) among others have not removed. 

Naturally very many devices for the elimination of the divergence 
difficulties have been suggested. The most direct method available 
consists in the subtraction of the divergent integrals in the way proposed 
by Dirac (1938, 1942) and Pryce (1938), or by Heitler and Peng (1942). 
The objection which now has to be met by all of these “subtraction” 
theories is that where the elimination of the divergences is not incomplete, 

* This paper was assisted in publication by a grant from the Carnegie Trust for the 
Universities of Scotland. 




no account can be given of phenomena such as the recently discovered
“Lamb effect”, which by common consent must be explained with the
help of the disappearing terms.

Various other methods may be distinguished and are summarized in
an excellent review by Pais (1948), which includes a very full bibliography.
It is clear that the necessary departure may be made in two main directions:
either one may introduce non-linearity in the radiation field, as
attempted by Born and Infeld (1934), or, without modifying the linear
nature of the field equations, various procedures, e.g. those suggested by
Bopp (1940, 1943), Podolsky (1942) and McManus (1948), all equivalent
to the introduction of multiple derivatives of the field variables into the
Lagrangian, may be employed.

The method here adopted has its genesis in the theory of reciprocity
proposed by Born (1939), a development of which was made recently, in
association with the present authors, in a paper (Born, Green et al., 1949)
here called Q.T.R. It is first necessary to specify the form of the wave
operator $F_{kl}(x, x')$ introduced in Q.T.R. appropriate to the electromagnetic
field; the field equations are then easily obtained by the general
method which is equivalent to the application of the principle of least
action. The electrons are represented as point singularities in the electromagnetic
field, and a reciprocally invariant Lagrangian is derived from
which the equation of motion of the electron can be obtained by variation
with respect to the electronic co-ordinates.

The resulting field equations are readily solved for the motion of an
unaccelerated singularity, in a relativistically invariant way which introduces
the reciprocally invariant four-dimensional angular momentum
tensor. For accelerated motion it is also possible to obtain an exact
formal solution which can be shown to coincide with Wiechert’s solution
of Maxwell’s equations at large distances $r$ from the electron, and for
short distances can be expanded in powers of $r$ and the fundamental
length $a$. The result is finite everywhere, and to a first approximation
reduces to the difference between the advanced and retarded solutions of
Maxwell’s equations for the field of a point singularity as postulated by
Dirac. The equation of motion of the electron can then be derived, also
as a power series in $a$, and in first approximation it is identical with that
obtained by Dirac. There is, however, an important deviation from
Dirac’s theory in the form of the interaction between two electrons at 
short distances; this indicates that it should be possible to obtain a classical 
theory of pair creation and annihilation without the divergent terms of the 
usual theory. 

As a preliminary to the quantization of the theory, a Hamiltonian 




formulation is essential; this may be obtained by an extension of the 
theory developed for pure fields in Q.T.R. It is easy to write down an 
exact expression for the energy; unfortunately this is not of the type to 
which ordinary methods of quantization are readily applied, as it contains 
high derivatives of the field variables with respect to time. It would not 
be correct to treat these derivatives as independent variables, and they 
must therefore be eliminated; this is accomplished with the help of the 
field equations. 

In order to apply the usual perturbation methods of quantum mechanics,
it is necessary to separate from the Hamiltonian a term which may legitimately
be regarded as small. Any term involving the self-field of the
electron does not satisfy this requirement, and it therefore has to be
separated from the resultant field, a procedure analogous to the separation
of the longitudinal field in orthodox electrodynamics. The resulting
theory is equivalent to the usual one apart from the presence of a factor
$\exp\{[(\mathbf{k}\cdot\mathbf{p}/E)^2 - k^2]/4b^2\}$ in the interaction energy, where $\mathbf{p}$ and $E$ are the
momentum and energy of the electron before the emission, or after the
absorption, of a photon with momentum $\mathbf{k}$. This rule, secured by a
proper application of the correspondence principle, ensures that the
contributions of all the “round-about” transitions to the self-energies and
cross-sections are small and finite, while leaving the real processes unaffected,
except to a small and rather interesting degree which, it may be
hoped, improvements in experimental technique may confirm.


2. Lagrangian Formulation 

In the original paper Q.T.R. (Born, Green et al., 1949) a general
principle was established according to which the Lagrangian density of 
any pure field must have the form 

where $F(p, p')$ is a reciprocally invariant operator satisfying the condition

(2irh)-*^F(p, p) exp {ip>x*\ti)dp - F(x, x) (2.2) 

provided the variables and x t are regarded as imaginary. This principle 
has been shown to lead to the prediction of a set of rest-masses for the 
elementary particles in sufficient agreement with those experimentally 
observed in cosmic radiation. The purpose of the present paper is to
examine the further application of the “principle of reciprocity” to classical
and quantum electrodynamics.





For the pure electromagnetic field, $\rho(x, x')$ is of the form

$$\rho_{kl}(x, x') = A_k(x)\,A_l(x'), \qquad (2.3)$$

where k and / are tensor affixes subject to the conventions of general 
relativity theory but referring to the Galilean metric 

gu ~ t s (x, r/). ( a - 4 ) 

The components of the four-vector $A^k = (\mathbf{A}, \phi)$ are then the potentials of
the electromagnetic field, and for such a vector field containing only
particles with vanishing rest-mass one has

$$F_{kl} = f(P)\,(-P g_{kl} + c\,p_k p_l), \qquad P = p^k p_k, \qquad (2.5)$$

where $c$ is a disposable constant, and $f(P) = 0$ may have no real roots.
The further condition that $F_{kl}$ should be self-reciprocal determines the
solution uniquely, with

$$c = 4, \qquad f(P) = \exp(-P/2b^2) \qquad (2.6)$$

obtained by substituting >t = 2 and « = 3 in the equations of § 5 of Q.T R. 
Then the field equations for the pure electromagnetic field, corresponding 
to Maxwell’s equations in orthodox electrodynamics, are 

$$\{P A_l(x) - 4 p^k p_l A_k(x)\}\exp(-P/2b^2) = 0. \qquad (2.7)$$


The four-vector $A_k$ is specified completely only when the auxiliary
condition

$$\frac{\partial A^k}{\partial x^k} = 0 \qquad (2.8)$$

is supposed to hold identically; this condition is introduced not of necessity
but in order to simplify the subsequent calculations. Then (2.7) reduces
to $\exp(-P/2b^2)\,P A_k(x) = 0$, and this can be further simplified by the
observation that there is no non-vanishing solution of the equation
$\exp(-P/2b^2)\,\chi = 0$. For let

$$\chi(x) = \int \exp(i p_k x^k/\hbar)\,\omega(p)\, dp_1 \ldots dp_4;$$

then, by definition,

$$\exp(-P/2b^2)\,\chi(x) = \int \exp(-p^k p_k/2b^2 + i p_k x^k/\hbar)\,\omega(p)\, dp_1 \ldots dp_4,$$

which vanishes only if $\omega(p) = 0$ identically. From this it follows that


$$P A_k(x) = -\hbar^2\,\Box A_k(x) = 0, \qquad (2.9)$$

which is equivalent to Maxwell’s equations.





The energy and momentum four-vector $P_n$ of the field, derived
straightforwardly in accordance with the principles explained in Q.T.R.,
may be obtained as follows:

P*i(p, /) - h-*e- (P+niu ‘{ -P^P”gu + 2(p k p, +p k pd), 

GS (A P') - *~^ P+ nitb \ -P m 'g» + *( 8 T/i + 8 TPJ 

+ (A, +/».)( ~P — p ~pg* + 4 PkPi)}\ (*■10) 

P n -\\ < (-Ptfc + G i Hp n + G* H p m )A'‘A'- > </*<■> 


, (*/ 8 B* BB., SB* d£ k \ , , 
f J\ dx t dx l m 0x 4 ex’*/ 


(a.n) 


where the pseudo-potentials $B_k$ are defined by

$$B_k = \exp(-P/4b^2)\,A_k. \qquad (2.12)$$

It is clear from these expressions that the theory of the pure electromagnetic
field is indistinguishable from the orthodox theory provided
$A_k$ is replaced by $B_k$.

When there is a system of electrons in interaction with the electromagnetic
field, it is assumed that the field equations (2.7) remain valid
except at certain singular points which may be interpreted as the positions
of the electrons. Then *


e “ Pl * b ' □ A k (x) - <?„ V 5 {x -*, 0 (/)}*(o/^ 

i 

-*•2 - XmWW - /«,(*)}—<*, (2.13) 

where the first line can be obtained from the obviously Lorentz-invariant
second line by expressing the proper times $s$ of the electrons as functions
of $t_{(i)}$, and performing the integrations with respect to $t_{(i)}$. These field
equations can be derived from the Lagrangian

L-J{ < 1 + (a i 4 ) 

where R* stands for the right-hand side of (2.13), and is a function 
of the electronic co-ordinates and their time derivatives only. 

* A dot placed above a symbol, thus: $\dot{x}$, here and elsewhere, denotes differentiation
with respect to time. $e_0 = \sqrt{4\pi}\,e$ is the electronic charge in Heaviside units, which are
used throughout this paper.




Let 2(^(0 k* the particular solution of the equation (2.13), from 
< 

which the general solution A* may be obtained simply by the addition 
of the complementary solution A?, which represents a pure electro¬ 
magnetic field, thus 

A k -£Afo+A*. (^S) 

i 

Then the Lagrangian (2.14) can be expressed in the form 
L-jJ < F n {A*A'' - 204 (V' + + 

as can be seen by the substitution of 2 , e for Jt k . It will be 

i 

shown in the following section that can be expressed in the form 

L„, - - ij < > dx<»K (a.16) 

This expression is obviously a function of the electronic co-ordinates and 
their derivatives only, and will be shown to reduce to 

Leo-( 21 7 > 

which is the generally accepted Lagrangian for a free particle with mass
$m_0$. The entire Lagrangian may then be written in the form

L-ij < F kX {AW - 2 (^* 4 +^''+^} > d*S'\ (a.18) 

which is required by the Principle of Reciprocity. 

It may be remarked that although the preceding theory is used throughout
subsequent sections of the present paper, there exists another formulation
in which $F(p, p')$ is regarded as self-reciprocal provided the
condition

exp Of#* -A*'*)/#•# <4, #' <4, *F(x, x’) (2.19) 

is satisfied, instead of (2.2). This leads to the formula 

F(p,p‘)-e-l>+*'*»>{t k f»ff k _ i{pk p> +p ‘ ipt)) ( a . ae ) 

instead of (2.10). It can be seen, therefore, that the consequences of the
two formulations of the theory are identical apart from the value of $b$,
which, assuming (2.20), would have a value $\sqrt{2}$ times the value implied
by (2.10).





3. Unaccelerated and Uniformly Accelerated Motion of Electrons

The solution of the equation (2.13) is readily obtained for an electronic
singularity $\mathbf{x}_s$ whose path is a straight line, so that

$$\exp(-P/2b^2)\,\Box A_k(x) = e_0 v_k\,\delta(\mathbf{r})/c,$$

$$\mathbf{r} = \mathbf{x} - \mathbf{x}_s = \mathbf{x} - \mathbf{x}_0 - \mathbf{v}(t - t_0), \qquad (3.1)$$

where $\mathbf{x}_0$ is the position at some initial time $t_0$. It will be convenient to
obtain the solution of the somewhat more general equation

$$F(P)\,\psi(x) = e_0\,\delta(\mathbf{r}). \qquad (3.2)$$

By definition one has 

P(P)<K X ) - exp (*>***/*)# M> . 

HP) - (2ir*)-»{^(x) exp (- i Pt x*mdx"\ (3.3) 

Also 

(2irA) _, Js(r) exp (- i Pk x k jh)dx lv 
- (in *)- 1 exp [1 p • (Xg - V/,)/*] • S(A - v*p lc ). (3*4) 

Hence 

P(PkP k )>fKj>) - -▼•P/f) «p [<p(Xo - vtj/h ]; 

exp (,>***/*>#"> 

= r 0 (27rt)-»J exp (- »pT#)// , {(vp/f)» - (3.5) 

The x 1 -axis may be chosen in such a way that it coincides with the vector 
▼; then, writing 

y-(ik-Oiy\ /*./*). ^“|k|, 

u - (V^i. F|. ' 1 ), * “ I u I. (3-8) 

it follows that 

exp ( -ik'Ulh)/F(-h*)Jk 

- ae,y«r sin (,**/*)*/F( - #)<**■ (3-7> 





With $F(P) = -\hbar^{-2} P \exp(-P/2b^2)$, as required by (3.1), one obtains

$$A_k(x) = e_0 \gamma v_k (ac)^{-1}\, Y(u),$$

$$Y(u) = 2a(2\pi)^{-2} u^{-1} \int_0^\infty \sin(ku/\hbar)\, k^{-1} \exp(-k^2/2b^2)\, dk
= a(2\pi)^{-3/2} u^{-1} \int_0^{u/a} \exp(-\tfrac{1}{2}y^2)\, dy, \qquad (a = \hbar/b). \qquad (3.8)$$

The pseudo-potentials $B_k$ defined by (2.12) satisfy

$$\exp(-P/4b^2)\,\Box B_k = e_0 v_k\,\delta(\mathbf{r})/c, \qquad (3.9)$$

and are accordingly given by

$$B_k(u) = e_0 \gamma v_k (ac)^{-1}\, Z(u),$$

$$Z(u) = 2a(2\pi)^{-2} u^{-1} \int_0^\infty \sin(ku/\hbar) \exp(-k^2/4b^2)\, \frac{dk}{k}
= a\pi^{-3/2} (2u)^{-1} \int_0^{u/a} \exp(-y^2)\, dy. \qquad (3.10)$$
The scalar magnitude u is a relativistic invariant, as may be shown 
by expressing it in terms of the four-dimensional angular momentum 
tensor, defined by 

”* kl - (* k ~ A) A - (** - *o)/S. A - m Y t,k - (3- 1 0 

The scalar of this tensor is 


lm kl tn kl «= wM/fr* - (r A v) 1 /* 4 } 

— wM* 1 . (3.12) 

If $t = t_0 + |\mathbf{x} - \mathbf{x}_0|/c$, $u$ reduces to the “retarded distance” of the point
$(\mathbf{x}, t)$ from the path of the singularity. For large $u$, both $A_k$ and $B_k$ given
by (3.8) and (3.10) reduce to $e_0 \gamma v_k (4\pi c u)^{-1}$, and accordingly coincide with
the generally accepted potentials of the electron.
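The limiting behaviour of the two profile functions can be checked numerically. The sketch below assumes the error-integral forms $Y(u) = a(2\pi)^{-3/2}u^{-1}\int_0^{u/a}e^{-y^2/2}dy$ and $Z(u) = a\pi^{-3/2}(2u)^{-1}\int_0^{u/a}e^{-y^2}dy$, evaluates them by Simpson's rule, and compares them with the Coulomb form $a/(4\pi u)$ at large $u$ and with their finite limits at $u \to 0$. The function names and the choice $a = 1$ are illustrative assumptions.

```python
import math


def gaussian_integral(upper: float, weight: float, steps: int = 2000) -> float:
    """Simpson's rule for the integral of exp(-weight * y**2) from 0 to `upper` (steps even)."""
    if upper == 0.0:
        return 0.0
    h = upper / steps
    total = 1.0 + math.exp(-weight * upper * upper)
    for i in range(1, steps):
        y = i * h
        total += (4.0 if i % 2 else 2.0) * math.exp(-weight * y * y)
    return total * h / 3.0


def Y(u: float, a: float = 1.0) -> float:
    """Profile of the potentials A_k: a (2 pi)^(-3/2) / u times the integral of exp(-y^2/2)."""
    return a * (2.0 * math.pi) ** -1.5 / u * gaussian_integral(u / a, 0.5)


def Z(u: float, a: float = 1.0) -> float:
    """Profile of the pseudo-potentials B_k: a pi^(-3/2) / (2u) times the integral of exp(-y^2)."""
    return a * math.pi ** -1.5 / (2.0 * u) * gaussian_integral(u / a, 1.0)


def coulomb(u: float, a: float = 1.0) -> float:
    """Large-distance form a / (4 pi u) shared by both profiles."""
    return a / (4.0 * math.pi * u)


if __name__ == "__main__":
    for u in (1e-6, 0.5, 1.0, 3.0, 10.0):
        print(u, Y(u), Z(u), coulomb(u))
```

Both profiles approach $a/(4\pi u)$ once $u$ exceeds a few multiples of $a$, and both remain finite as $u \to 0$, tending to $(2\pi)^{-3/2}$ and $\tfrac{1}{2}\pi^{-3/2}$ respectively under the assumed forms.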

The solution (3.8) may be substituted in (2.16) to obtain a first approxi¬ 
mation (with neglect of acceleration) to the exact Lagrangian of an 
electron in interaction with the pure electromagnetic field. With the help 
of (3.12) one has 

b , 3Y 8u dY 

—A, = e 0 yv k (ac)~ l ~ — - e a yv k (ac)~ l — m kl yv l !m<* -o, (3.13) 


so that the condition (2.8) is satisfied, and L, reduces to 




dz dZ 


8 x l 8x 


Jx<*K 


(3-*4) 





Substituting for the intermediate solution 

Z(J )- *[(air)*^]- 1 -J exp{.pr/A + [(p-v/tf - pWJ/Q* 1 -(p-v/r)*)-rf/‘« (3.15) 
one obtains 

L»--«o/(2(aw) ,l/,, «y). (3.16) 


This is the customary Lagrangian of a free particle with mass $m_0$ given
by (2.17); the derived momentum and energy are $\partial L_e/\partial\mathbf{v} = m_0\gamma\mathbf{v}$ and
$\mathbf{v}\cdot\partial L_e/\partial\mathbf{v} - L_e = m_0\gamma c^2$, a circumstance which justifies the adoption of $L_e$
as the Lagrangian of the unperturbed electron. It will appear later,
however, that $m_0$ is not the entire rest-mass of the particle, as a further
contribution arises from the interaction with the electromagnetic field.
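The step from the free-particle Lagrangian to the momentum and energy can be written out explicitly. The short derivation below is a sketch which assumes that (2.17) has the usual form $L_e = -m_0c^2/\gamma$ for a single electron, with $\gamma = (1 - v^2/c^2)^{-1/2}$.

```latex
\begin{align*}
L_e &= -m_0 c^2 \sqrt{1 - v^2/c^2} = -\frac{m_0 c^2}{\gamma}, \\
\frac{\partial L_e}{\partial \mathbf{v}}
    &= \frac{m_0 \mathbf{v}}{\sqrt{1 - v^2/c^2}} = m_0 \gamma \mathbf{v}, \\
\mathbf{v}\cdot\frac{\partial L_e}{\partial \mathbf{v}} - L_e
    &= m_0 \gamma v^2 + \frac{m_0 c^2}{\gamma}
     = m_0 \gamma \left\{ v^2 + c^2\left(1 - \frac{v^2}{c^2}\right) \right\}
     = m_0 \gamma c^2 .
\end{align*}
```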

The remaining two terms $L_f$ and $L_{ef}$ in the Lagrangian, representing
the electromagnetic field and the interaction respectively, are
most conveniently expressed in terms of the pseudo-potentials
$B_k = \exp(-P/4b^2)\,A_k$. Then $L_f$ is given by Maxwell’s formula


and L ef is reduced to 

Ur 


- e 0 v k (V" • 


( 3 . 17 ) 


( 3 *«) 


in the first approximation. This formula shows that the effect of the 
present theory is essentially the same as the introduction of an electron 
with a Gaussian charge distribution in the ordinary electrodynamics; 
this suggests an intuitive interpretation of the theory of reciprocity according
to which measurements of the position of an electron are subject to an
uncertainty of the order of the electronic radius, and the observed charge
density is therefore not singular, but distributed in space according to
the normal error law. The concept of the electron as a point singularity
in the field $A_k$ may nevertheless be retained.

Variation of the Lagrangian with respect to the co-ordinates of the 
electron will obviously give in this approximation an equation of motion 
which does not properly take account of the radiation damping. To 
obtain a second approximation, it is necessary to evaluate the expression
$\exp(P/4b^2)\, v_k\, \delta(\mathbf{x} - \mathbf{x}_s)$ for an electron with constant acceleration; this is





most easily effected by first writing it in the form 

<r p/4 *’ - x.) - ( 27 r*)-®J exp(- ~^j-exp (ik-(x -x,)/*)</k. (3.19) 


For brevity set $\alpha = -i\mathbf{k}\cdot\mathbf{x}_s/\hbar$, so that $d^3\alpha/dt^3$ vanishes by hypothesis.
Then, since by straightforward computation


one has 


"a*-"-", (k > n) 


( 3 - 2 °) 


I 3 *" . «"/ k\ /A-2«\ 1v .. 

( *e“\. v ,,„v 

exp \ 4^* fif*/ ^0 4^/ *1 ;?o(«»)i(«-»oi 

Putting $n = m + l$ and summing first over $l$ and then over $m$, one obtains

exp(-£ - £ exp { -ilcx/A + £*(k-v/r)*/4^}, (3»> 

£-(i-iVjflfX./r 1 *)-•*. (3.23) 


where 


Similarly 

exp ^) v<r °“ -^j-d^expf-ik-x,/* + £*(k-v/c) l / 4 ^*}. (3 24) 

Hence finally 

exp (Pj^lF) ■ t»*S(x - x,) =■ (2ir*) - *|£ exp{/k* (x - x,)/A - i*/2 J* + £*(k* v/c)*/!#*} 

(P* + i‘t t Fk-+v k lt*)</ 1 c.. (3.25) 


This result will be required in a later section. From it the exact solution
of the field equations for a uniformly accelerated electron, valid at all
distances from the singularity, can be written down at once as the usual
solution of the equation $\Box B_k = e_0 \exp(P/4b^2)\, v_k\, \delta(\mathbf{x} - \mathbf{x}_s)/c$. As the right-hand
side is non-singular this solution is finite everywhere; but as it is
rather too complicated to be of value in practical computations, an alternative
method of solution which gives the values of the potentials for an
arbitrary motion of the singularity at both large and small distances from
the singularity has been developed, and is presented in the following section.


4. The Field and the Equation of Motion of a Charge 
in Arbitrary Motion * 

For an exact treatment of the field of a charge in arbitrary motion, the 
function (2.14) may be written in the form 

$$L = L_0 + L_1.$$


* This section written by K. C. Cheng. 






Here, according to (2.5) with (2.6) and (2.8), L 0 is 


Lo“ 



3 l* 


8A* 
dx | 



dx 1 ) 


dx tu , 


( 41 ) 


and $L_1$ represents the interaction term and self-energy in the form (2.17),
thus:

L *“ ~ + ‘ ^)*} 1 Px 8(JC ‘ “ ^ dx "' ds ’ (4 S) 

where i distinguishes different electrons, as in § 2. 

The total action is defined by

$$T = \int L\, dt, \qquad (4.3)$$

from which the equations of motion of both field and particles can be
derived from the principle of least action by variation with respect to $A_k$
and the co-ordinates $x^k_{(i)}$.

The variation with respect to A k gives 



On the other hand, variation with respect to $x^k_{(i)}$ gives



d 8 L 8 L 
ds jxfc 
ds 


or 


(4 s) 

Setting 



one finds the equation of motion of the charge 



dV* *&>- 

(4.6) 


Writing for convenience 

+ Affo /(o1-/($*+/$*, 





where Afy t Afy satisfy the equations 

□ exp (- i«* □ )A$ - - e t jc fyx {< „ II h(x k - xf^ds, 

JGS 

□ exp ( - W □ - - 2 - ^ 

j cjas i 


and the dash sign excludes the term $i = j$, one has from (4.6)



( 4 - 7 ) 


( 4 . 8 ) 


The second term on the left-hand side of (4.8) will be worked out and 
discussed below. 

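According to (4.4) the field equation can be written in the following
form:

$$\Box \exp(-\tfrac{1}{2}a^2\Box)\, A^k = -e_0 \int \delta\{\mathbf{x} - \mathbf{x}_s(t_s)\}\,\delta(t - t_s)\, v^k/c\; dt_s, \qquad (4.9)$$

where $v^k$ is the four-vector $(\mathbf{v}, c)$ representing the velocity of the electron,
and the proper time $s$ has been replaced by the time co-ordinate $t_s$.
From now on, for convenience in discussion, attention will be confined
to the problem of one electron. Writing $D = \dfrac{1}{c}\dfrac{\partial}{\partial t}$ and $\Delta = \dfrac{\partial}{\partial\mathbf{x}}\cdot\dfrac{\partial}{\partial\mathbf{x}}$, so that
$\Box = \Delta - D^2$, and operating on both sides of (4.9) by $\exp(\tfrac{1}{2}a^2\Delta)$, one has

$$\Box \exp(\tfrac{1}{2}a^2 D^2)\, A^k = -e_0 \int \exp(\tfrac{1}{2}a^2\Delta)\,\delta(\mathbf{x} - \mathbf{x}_s)\,\delta(t - t_s)\, v^k/c\; dt_s. \qquad (4.10)$$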
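The operator step leading to (4.10) is worth stating once in full. The lines below are a sketch which assumes the reading $D = c^{-1}\partial/\partial t$ and $\Box = \Delta - D^2$, and uses only the fact that $\Delta$ and $D$ commute, so that the exponentials may be combined.

```latex
\begin{align*}
\exp(\tfrac{1}{2}a^2\Delta)\,\exp(-\tfrac{1}{2}a^2\Box)
  &= \exp(\tfrac{1}{2}a^2\Delta)\,\exp\{-\tfrac{1}{2}a^2(\Delta - D^2)\} \\
  &= \exp(\tfrac{1}{2}a^2\Delta - \tfrac{1}{2}a^2\Delta + \tfrac{1}{2}a^2 D^2)
   = \exp(\tfrac{1}{2}a^2 D^2).
\end{align*}
```

The spatial part of the smoothing operator is thus transferred to the source term, where it acts on the delta-function, and only the time derivatives remain on the left-hand side.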

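Using the relation

$$\exp(\tfrac{1}{2}a^2\Delta)\,\delta(\mathbf{x} - \mathbf{x}_s) = (2\pi)^{-3/2} a^{-3} \exp\{-(\mathbf{x} - \mathbf{x}_s)^2/2a^2\},$$

which can be easily verified by comparing the Fourier components of both
sides, (4.10) becomes

$$\Box \exp(\tfrac{1}{2}a^2 D^2)\, A^k = -e_0 (2\pi)^{-3/2} a^{-3} \int \exp\{-(\mathbf{x} - \mathbf{x}_s)^2/2a^2\}\,\delta(t - t_s)\, v^k/c\; dt_s. \qquad (4.11)$$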
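The comparison of Fourier components mentioned above can be illustrated in one dimension. Acting on $e^{ikx}$ the operator $\exp(\tfrac{1}{2}a^2\,d^2/dx^2)$ multiplies by $\exp(-\tfrac{1}{2}a^2k^2)$, so the smeared delta-function should have exactly these components; the sketch below checks that the normalised Gaussian of width $a$ does. The one-dimensional reduction, the function names and the numerical settings are assumptions made for illustration (the three-dimensional relation factorises into three such integrals).

```python
import math


def gaussian_density(x: float, a: float) -> float:
    """Normalised one-dimensional Gaussian of width a."""
    return math.exp(-x * x / (2.0 * a * a)) / (math.sqrt(2.0 * math.pi) * a)


def fourier_component(k: float, a: float, half_width: float = 12.0, steps: int = 4000) -> float:
    """Trapezoidal estimate of the integral of gaussian_density(x) * cos(k x) over the real line.

    The range is truncated at +/- half_width * a, where the integrand is negligible.
    The sine part vanishes by symmetry, so only the cosine part is computed.
    """
    lower = -half_width * a
    h = 2.0 * half_width * a / steps
    total = 0.0
    for i in range(steps + 1):
        x = lower + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * gaussian_density(x, a) * math.cos(k * x)
    return total * h


def smeared_delta_component(k: float, a: float) -> float:
    """Fourier component of exp(a^2/2 d^2/dx^2) delta(x): the delta gives 1, the operator exp(-a^2 k^2 / 2)."""
    return math.exp(-0.5 * a * a * k * k)


if __name__ == "__main__":
    a = 1.5
    for k in (0.0, 0.5, 1.0, 2.0):
        print(k, fourier_component(k, a), smeared_delta_component(k, a))
```

The two columns agree to within the accuracy of the quadrature, which is the content of the relation in this reduced setting.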

To preserve the customary idea of causality one has to choose, among the 
possible solutions of the D’Alembertian differential equation (4.11), the 
retarded one, namely 

A k - efacPcYKiir)- 11 * exp (-1 x-x' |/c) | x-x' |~ l 

exp [-{X'-X/ 0 }VJ^*.(& ( « ( 4 -ia) 




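where the third term inside the $\delta$-function takes account of the retardation.
Introducing the abbreviations

$$\mathbf{R} = \mathbf{x}' - \mathbf{x}, \qquad \mathbf{r} = \mathbf{x} - \mathbf{x}_s, \qquad r = |\mathbf{r}|, \qquad \cos\theta = \mathbf{R}\cdot\mathbf{r}/Rr,$$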

one has 

A fc -1/ 0 (aexp (- ) a % JF)^Rv*8(t - /, - Rjc) sin 9 

•exp [ - (R* + r 1 + arff cos 9)jtcf]d9dpdRdt e . (4.13) 

It should be noted that for $t_s > t$, $\delta(t - t_s - R/c) = 0$; thus the region
of integration for $t_s$ is between the limits $t$ and $-\infty$. Performing the
integration first over $\phi$ and $R$, then over $\theta$, one obtains

ex P (- w fc (/“0 sin 0 

_ 3 

exp [ - {**(/ -/,)* + - / f ) cos 0 + f^lid^dd 


■=lw(V( 21T )' a )~ l ex P (- 

exp [ - (<</ - O - 

40 


Let 

- exp [- (<-(/- /.)* + r)*/2« , ]}<//,. 

( 4 - r 4 > 

then 

t,= 

( 4 -IS) 


di) 

( 4 - 16 ) 

dt. =C V " 

where $v_r = (\mathbf{v}\cdot\mathbf{r})/r$.



Now any function $f(\mathbf{x}, t_s)$ of the variables $\mathbf{x}$ and $t_s$ can be expressed in
terms of $\mathbf{x}$ and $\xi$ or $\eta$ with the help of (4.15); then it is obvious that




and, by Taylor's expansion, 


& /n 


/{X, /.[x, (£+*<)]>-ZJ-^/tx, (o+^/)», 


(4-17) 


(4.18) 


where, in view of (4.15), 

A*, /.[X, (o +tt)]} -[/(x, Olmt.i 

with $t_s$ defined by the condition $r(t_s) = c(t - t_s)$, and “ret.” outside the
bracket denotes as usual the retarded value.

Similarly, for any function $g(\mathbf{x}, t_s)$, one has

f(X, /«) “ 2 ~7-^"L?^X| /«)] td T ., 
n 1 


( 4 » 9 ) 





where $t_s$ is determined by the condition $r(t_s) = c(t_s - t)$, and “adv.” denotes
the advanced value.

With the help of the definitions (4.15) and (4.16), (4.14) becomes 


A k * 


The two cases where $r(t)$ is large or small compared with $a$ will be
discussed separately below in (i) and (ii). In (iii) the explicit equation
of motion of a charge will be deduced and discussed.


(i) For Large Distances $r \gg a$


The second term in (4.20) is negligible because of the smallness of
the exponential factor, and, as $r/a$ is large, one can replace the limits $r(t)$
by infinity. Then


2{tn)*'*a 






v k d£. 


(4.21) 


On using equation (4.18) one obtains 


v* * 1 r 1 


Hence 



(4*2) 


(4.22) shows that at large distances the four-potentials are identical
with those obtained by Wiechert. The error introduced in replacing $r(t)$
by infinity is of the order of $e^{-r^2/2a^2}$, which is very small when $r/a$ is large.
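How quickly this error becomes negligible can be seen from a few values. The sketch below takes the order-of-magnitude estimate to be $\exp(-r^2/2a^2)$, which is an assumed reading of the exponent, and tabulates it against the ratio $r/a$; the function name is illustrative.

```python
import math


def truncation_error_order(r_over_a: float) -> float:
    """Order of magnitude exp(-(r/a)^2 / 2) of the error made by extending the limit to infinity."""
    return math.exp(-0.5 * r_over_a ** 2)


if __name__ == "__main__":
    for ratio in (1.0, 2.0, 4.0, 6.0):
        print(ratio, truncation_error_order(ratio))
```

On this estimate the correction is already below one part in ten million at six times the fundamental length.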


(ii) For Small Distances $r \ll a$


Writing


A*- 


2(2ir) ,ll a 


(P* + Q k ), 



where 




Using (4.18) one obtains 


0 «t \ 2 / LK^'PrjJn*. 

and by separating the even and odd powers of D one has 


^,4 pt ] 

U-»r)L f V (a* + l)l ^ U-tV)L 
By a similar procedure one obtains 

J— 'W' «'r) W <? + t/ r)J»dT. 0 (a«+t)l lf(r+t/ r )J IMjr 

Hence 

-[—-] } + ** 

l Lr(r-OfUn,,. Lr(r+iv)J wW J 

with 

A>*=-j«-^v»y^ B ^ (VM) ». + ,/r__«•*_] _r_j:*_] 1 . ( v 

0 (a» + i)l lLr(r-Jv)J Irt . LK' + tvxIrtT./ 
Expanding K 11 in powers of a one has 

^■- s i(^-r4yW---~-)Wf— -1 -l-~ )• 

1 \ g (i-«-i)!( 2 W + i)!j \[r(r-t<,)J n< lr(r 1 f< r )»d,./ 

It is shown in Appendix I that 

y a lw (-) w wi 1 

0 (s - m - 1 )! (j« + 1 ) I “ (17^ ij(j^7)V 


■f—1 

LKr-t'rlJ, 


^.£(1^.,. ! J[jL] -[——I ). 


Then 





It is well known that the solutions —-—- I , I ~~—r I 

LK*-*v)_L. V<f+**»*« 

can be expanded in power series of * thus 

[ v* 1 * (-\n 

i —d “ 2 ^- 7 -ov"- 1 *'*/'], 

r^i <4 .as) 

LK<^ + »r)J«lT. 0*> 


and hence 


\ v r)J ret. U(' + *v) J»dT./ M” - 0 (2/l + l)l 


Now for the term Q k one has, by applying (4.18) and (4.19), 


1 


Q k = e 


0»HJo L'(*-»r)-U. 

Jo LK* + *V)-Lt. 


By using (4.25) one obtains 




2 {; 

■ 0,fl»e0 1/ 


r®»-am na« rin-i 

—- T—^P*IC --- 

(2 m) I (2 n - 2 tn) I ( 2m + 1) I 


n-Im+l jyXn+* 1 


o («0> r 


Since 


because 


fKc) v <0 

zH (£- r)*»v*ledi - Z>*"({ - r)*»»*/“*£» 

Jo Jo 


/(0Z*-*(C-o, 


* Cf. Frenkel, Elektrodynamik, i, 184 (1926).





one has 


Q* - „-•*•/» Y -Uota-f (£ - rr*-«"dfrlc 

0 ( a »)l ^0 

-2if -«■£■/* y 2J*»r—-(ia*) m r lm+ 1 *«/*/^l- 

^"0 L* I (2W + 2* + l) I * ' J 

Expanding 0 * in powers of a and keeping the first power in r, but all 
powers of t , f, etc., then one has 


Q*- 2 Z aU & 


(429) 


where 


d-(-wirr 2 (4.30) 

It is shown in Appendix II that 




2 f iv+s) r(j)l 
(1 - 2j)\r(j+f*+ 1) iW {43I) 


? 0 m 1 (s+m)I T(i)(i 

Thus, combining (4.24), (4.27) and (4.30), and using (4.31), one obtains 




where 


-fcL - [^LJ 


r(*)(i-«)” 0 ( 2 /* + i)l (/i + j)I 


r*v k lc. 


(4-3*) 


(433) 


For the purpose of illustration the first two U)’s are computed in 
Appendix III. The following are the results of computation:— 





(4 34 ) 


2 c/f+ £(❖•▼)«*+3(y.v)t)*} + ^y*f * 

<r r 

+ 'Ap.lf* - i|y»(v.r)p* + + ^y 7 (*-y)(y-r)]t?* 

+ ^/(V.r) + ^y»(y.v)(<r.r) + ^-(^.v)(y.r)+^y 7 (v*V)(v.r)Jti‘ 



+[i/(?*r) + ~(v.v)(v.r) + ^(v-v)(*.r) + ^(v-vX^.r) 

+ —/(♦•▼)*(v-r)+^ 5 /(?.v)(y.r) + ^(V.v)(v. r) + ^y»(v.v)(*-v)(v.r) 

+ ^V(T-*)(v-T)(v.r) + g y U(fr. V )»(v.r)J«»‘ ( 4 . 35) 

where $\dot{\mathbf{v}} = \dfrac{d}{dt}\mathbf{v}$, etc., and $\gamma = \dfrac{1}{\sqrt{1 - v^2/c^2}}$.

It follows from (4.32) and (4.33) that the potentials are all finite at the 
origin. 

(iii) The Equation of Motion of an Electron 

According to (4.8) the equation of motion of an electron is the
following:

<mo 

where 


fall 


Using the three-dimensional space description, one has for v=i, 2, 3 the 
following vectorial equation. With A^c, A =(A V A % , A t ), 


where 


- r 0 ( E<" +-TA H'*»j + \ A (4.37) 


_ d id 

H - c “ rlA - 


Writing A u \ etc. in powers of a t 

A"'- Z f*- 2 






then, according to (4.32) and (4.34) one obtains 




" < f o(a«*)~ 1 (air)- , '*| - ^yv+y»/c*(V •▼)▼+y*(| - »•/<*)▼ J, 


(4-38) 





and similarly 


-v A H*?! = -v A — A A ( '\ 
c c dn 


(4-39) 

The addition of the two terms given by (4.39) and (4.38) gives the 
surprisingly simple result 

- ^ 0 (e _ 1 + -V A H_ 4 ) - W(^)-K2n)-*'*y^~n. 


Now the term 


^o( ^0 + 


^VAHjj- e.V 1 - - [▼/'(' + ®r)W} 


- - °r)]rrt. - lvjr(e + 




is simply Dirac’s radiation damping term, which, being evaluated, is 
equal to 

(mc*y \ ds* \ ds* ds* )ds f 

Thus, by neglecting the other powers of a, one obtains exactly Dirac's 
equation of motion of an electron: 

. d 1 (d* fd t x k d t x k \dn1 

Wo + - ***■•{*? + '-XdP' id)*) 


.<,y.(E l " + r'nH w ) l 

as 


(4-4o) 


which is obviously rclativistically invariant. 

However, there is a difference between (4.40) and Dirac's equation of
motion if one has under consideration several electrons interacting with
one another at distances smaller than or comparable with the fundamental
length a. The difference arises from the right-hand side of the equation
of motion. The external field due to other sources, $e_0(\mathbf{E}^{(e)} + c^{-1}\mathbf{v}\wedge\mathbf{H}^{(e)})$,
consists, according to Dirac's theory, only of the retarded potentials. In
the present theory this conclusion is correct if the particles are separated
by distances greater than a; for at large distances, the field acting on each
charge is shown in (4.32) to be identical with that derived from the retarded
potentials, whereas, if r is smaller than or comparable with a, this conclusion
no longer holds; the potentials given by (4.22) have to be replaced by

* This term was actually given many years earlier by W. Behrens and E. Hecke,
Nachrichten der K. Gesellschaft der Wissenschaften zu Göttingen (1912).




(4.32), in which retarded and advanced potentials play symmetrical
rôles.

The first correction to the equation (4.40) proportional to a will now 
be evaluated. By applying the same procedure as that by which (4.40) 
was derived, the following equation may be obtained:— 


dx k \dx 
~7£"d?ids 




where 


-<,7(E (,I +VAH<•'/<), 
as 




(4.41)


The detailed computational procedure is in Appendix III. This correction 
corresponds to that for an electron of finite radius which has been in pre- 
relativistic times studied by several authors for a rigid model (Herglotz, 
1903; Sommerfeld, 1904; Krazer, 1905; Born, 1909). If the relativistic
terms are neglected, the new formulae are almost identical with the old 
ones. The present form has the advantage of being relativistically 
invariant in every power of a. It shows that a relativistically invariant 
theory of an electron with a structure is possible and that only the meaning 
of the radius has to be re-interpreted. 


5. The Hamiltonian Formulation * 

The Hamiltonian energy and total momentum for an electron in 
interaction with an electromagnetic field are readily derived from the 
Lagrangian defined by (2.18). On account of the auxiliary condition 
(2.8), the Lagrangian may be written in the simplified form 


L-iJ < FA*A\ > -»•/<«)*,' 

F-F(/>, /') - - 


(5.1)


In order to derive the energy and momentum, the general method which 
was explained in Q.T.R. may be applied to the first term, and the usual 


* Following sections written by H. S. Green. 




methods suffice for the remainder. The Hamiltonian energy so obtained 
is 

^4-jj <(~E+G*pt + G t 'p\)A*A' t > dx lt) + e 0 A t (x t ) + mj t (i (5.2) 

where G l and G 1 ' are defined so as to satisfy 

F(P,P) *» F(p, />') + (Pi -p t )G l '{p, />'),} {5 ' 3) 

and are explicitly 

a,- 

<?,- - *-<-<' ' '"'“■{a+(A +A>(' U ^“)4 (5-4) 

The total momentum is defined by 

P--f< (G 4 p +C‘'pVMi)rf*'» +- # A(x,) + « 0 v(i (5-5) 

so as to form, together with $P_4$, a four-vector.

It can be shown that, by virtue of the field equations and equation of 
motion of the electron, 

F(p, P)A t (x) - %fi(x - x # ), 

the conservation law $dP_l/dt = 0$ is satisfied. For, from the first of the
equations (5.6) it follows that 

J < {F(p\ P’)Pi - F(p, p)p t }A*A' t > </*<»> - |^(x H ,), (5.7) 

and the left-hand side of this equation is, according to (5.3),
} < {- (Pi -pdF + (P t -pt)(G k p t + G*'ti}A*A- t > dx'». 

Since, if J(p, p ') is any function of p k and p' k , 

j < ip\ -pdfA'A > ^ 

--78j£[</W,>*«» 


(5.8)





the equation (5.7) reduces to 

\ 7 A < ( " + Glpl+ G *' pi)A * A * > dxW " ~ a;r (x « ,) - (5 - 9> 

Combining (5.9) with the second of equations (5.6), the desired result
$dP_l/dt = 0$ is immediate.

The Hamiltonian $cP_4$ may be expressed in an alternative form by
substituting in (5.2) the expression 

/'« F(p , P f ) - \{F(P\ f) + F(p, p)} + j (A -*)«?' - CO (S.10) 

derived from (5.3). The term arising from $\tfrac{1}{2}\{F(p', p') + F(p, p)\}$ can be
simplified with the help of the field equations (5.6), and it can be seen from
(5.8) that the term containing (p' - p) · (G' - G) will contribute nothing;
the result is therefore

cP k - ij < (Pi +Pi)(G t + G\)A*A\ > dx<*' + %.A(X U) ) + i - i^c*) -» (5.11) 

< (/>i+P*)(G\ + Gi)A“A’ k > </*<*>+v-p “> + me cKi -*•/<*)», 
where the canonical momentum is given by 

p ( «- Wo v(,+ i# A(x ( „). (S.12) 

tfv c 

With the help of (5.8) the momentum P defined in (5.5) may be written in 
the similar form 

P--[ < (P + P ’Mi + G-JA'A’i > dx' a) + - A(x, 0 ) + « 0 v(i -v*lc*yK (5.13) 

4cJ c 

Another expression for the energy-momentum four-vector is obtained 
from (5.2) and (5.5) by substitution from (5.4) and rearrangement of the 
terms in the following way:— 

Pi"*P{ f WqWi(i — v 1 /^*)” 4 + K u 

'^-**-*{ < (PJ> k 8 / ~P v Pi -p'tiB*B\ > <&<»> + e 0 J',(x.), 

/ A +PdPiPB'lti > </*<»>+< 0 {A(x.)-( S ., 4 ) 

where B k is again the pseudo-potential e~ FI 4 t> A k , and 

( 5 .,5) 





From this the expression 

cP i - cF{ + 7. (P - PO + -1 + (cJC A - v- K) ( 5 . 16 ) 

for the Hamiltonian energy follows at once. It will now be shown that 


*-:[ 

Je A 

||(E* + H*)rfx < * > j 

E- -1 

(*+£)• 

8 

H-- , B. 


(S-i 7 ) 


Considering first the fourth component, one has from (5.14) 

~ B ‘ J B ' + B *0 B^dx™, ( S .i8) 


since □ 5 4 = ^" p/4 * , 8(x-x <0 ). After the elimination of the time deriva- 

1 • 0 

tives of B a with the help of the auxiliary condition -B A = - —*B, this 
reduces to 

~ B * AB *) - B-JB -(^• B )}^‘", (5.19) 


which, by a trivial integration by parts, gives the formula (5.17). 
Similarly 


--lBE B )- BtB 4- 4 } + {tl,- 


’•}] 


B + B ABMdx"\ ( 5 . 20 ) 


which leads to the vector component of (5.17). 

It may be shown further that the term $cK_4 - \mathbf{v}\cdot\mathbf{K}$ vanishes if terms in
the acceleration and higher derivatives of $\mathbf{v}$ are neglected. For then, with
the help of (5.8),

+A)(A d*>" 

-S?*-*! < > <&<»> - - 8 }v k {s 4 k (x (i) ) - A k (x l0 )}, (5.21) 

so that 

•**i - *-‘J< (/« )v x pxPB*B\ > dx«\ < 5 «) 





which vanishes under the condition stated, since 


v x p x PA k - *(* " *«>)} - o- 

It is usual to express the Hamiltonian in terms of a set of canonical
co-ordinates and their conjugate momenta. The co-ordinates most
convenient for the present purpose are those of the electron, and the
Fourier components $b_l(\mathbf{k})$ of $B_l(x)$, defined by

k 

( 5 - a 3 > 

for a finite region Ω of real space. Expressed in terms of these co-ordinates,
however, the Lagrangian (5.1) and all derived quantities,
including the Hamiltonian, contain all even derivatives of $b_l(\mathbf{k})$ with respect
to time. It is clear on physical grounds, however, that the state of the
system is completely determined given only the values of the co-ordinates
and their first derivatives at any time.* All higher derivatives have
therefore to be eliminated. The elimination of the high derivatives of $b_l$
is easily effected with the help of the field equations, which can be written
in the form






and from which one obtains by repeated differentiation 




U/ 

-m" 




The equation of motion of the electron, which must be employed to 
eliminate the high derivatives of x, is of the form 

K =/(*«. K *. *,*,*,•••), (S-a6) 

where f is a known function of its arguments. The method which can be
applied is to write 

etc., 

at 

* A rigorous mathematical proof of this is postponed for future consideration. 





and to proceed step by step, assuming that in the first approximation the
motion of the electron is uniform, so that $\dot{\mathbf{v}}^{(1)} = \ddot{\mathbf{v}}^{(1)} = \dots = 0$. Assuming
for the present that the assertion italicized above is correct, the solution
obtained in this way will be unique.

The effect of a procedure of the kind just described on the Hamiltonian, 
as shown in Appendix V, is to reduce the number of canonical co-ordinates 
and momenta to one for each degree of freedom of the system. At each 
step in the elimination procedure approximations 

H<">(*„ S lt x, x, *<«>, *">, . . . ), p<«>(* t , i lt x, x, *<»>, V<»>, . . . ) 


to the Hamiltonian and momentum of the electron are obtained which for 
large n converge to their exact values. Since V, V, . . . occur always in 
the combinations a'tjc*, . . ., however, H (l * and p m are sufficiently 

exact for all ordinary purposes. To this approximation, as has already
been seen, the term $cK_4 - \mathbf{v}\cdot\mathbf{K}$ disappears from the formula (5.16), so that


H - eP A - *v*(i - v'/e*)* + (P - PO +K 
Also, according to (3.5), with F(P)=e 0 e~ PI4b ', 




1 Ce-W-'K^dk 


(5.27)


(5.28)


In the next approximation it would be necessary to use (3.23); it is clear, 
however, that the improvement in accuracy so obtained would be very 
slight. Substituting, therefore, (5.28) with (5.23) into (5.15), one has 

(5-29) 

k 

to a high degree of approximation. 


6 . The Quantum Electrodynamics 

It is now possible to effect a natural transition from the classical 
theory of the preceding sections to the corresponding quantum formalism. 
The commutation rules for the electromagnetic field, established in §4 
of Q.T.R., are 




(6.1) 





whilst those for the electron are simply 


(*, 3). ( 6 - 3 ) 

where p (il is the canonical momentum defined in (5.12). According to 
(5.11) and (5.13) one has 

+A)(G* + G*')A M' t > Jx™ +/P, (6.3) 

4^J 

-V*jc*)* + v*p (< \ 


Since p ( P is assumed to commute with A k , it follows, as in §4 of Q.T.R., 
that 

[/>„ A k (x)] - ; (M) 

also 

[A. 4J =/'A8f (/-I, 2,3); rA. *«)]-*'#▼/<•• ( 6 -5) 

It is easy to see that the components of P t will commute among 

ih d 

themselves; for, according to (6.4) and (6.5), one has [/> 4 , P]=— — P, 

which vanishes, according to the conservation law proved with the help of 
the field equations in the previous section; and the components of the 
cartesian vector P involve no operators which do not mutually commute. 
Since PA k ~- :* F/ “VfcS(X-X^) commutes with A kf (6.1) reduces on sub¬ 
stitution from (5.4) to the form 


- +p'<)[A k (x), A fix') j-A mS<X - XO, {(' - /), 


which is consistent with the usual commutation rule 

[B k (x) t 8(x-x') f (/'-/). (6.6) 

The latter may therefore be regarded as the fundamental commutation 
rule, from which (6.1) is readily inferred.

The commutators of the components of F* u expressed by (5.17), are 
readily evaluated in terms of 

H (x)e p ' u, 8 (x - 

E(x,„) - Je(:c), p '*’8(x -x (() )</*<», 


(6-7) 





in the following way:— 

[K ^ - Kff’tW' 

- «j(£. Eff, - eJL -i*c 0 ff» 

[Pi, P& - i*e&, [Pi, P{] - i*erf» 

[K Ei]~i^[E t E t -E t E t , E'* + H' t ]dx (,t dx <a) ' 

- EE X - jvJL hW»’ « -i*ej* lt 

J \#X OTt / 

[/¥, - - «*i?„ [flf. ^0 - - 


One has further 

[A. H\ - J A, E W* - itMu*’ - o, 1 
tA, -**/A r A. /fl - - *#A. 

[A. ^wJtA. e'»+h'v* <8 ’'- -»*A- 


(6.9) 


Hence the field energy P{ commutes with the vector P / -«*S, which 
may be regarded as the momentum vector of the field, and satisfies 


[P{ - P{ - = ihejf t , etc. (6.10) 


The total field A k may be separated into the unperturbed self-field of 
the electron A k \ and a radiation field A r k , Then if the total momentum 
and energy, given by (5.14) in conjunction with (5.17), are expressed in 
the form 

cP = rp° +IE' A H r dx ,a) - <' 0 A r , 
cP k = <p\ + i|(E rt + H'V*'*' - ej\. 


p • and p\ may be interpreted as the entire momentum and energy associ¬ 
ated with the electron. On account of (6.5), the normal commutation 
rules 


[pi (A /-1, a, 3); IPi *«»]•iktjc (6.13) 


are satisfied. Substituting (6.10) into the formula (5.27) for the Hamil¬ 
tonian, the expression 


cpl - «or*(i - o*/c*)‘ + v.(p 0 -^J(E A H -E r A H- ~A 
+ ij{(E* + H*) - (E rt + H*)}*/**" + e*A\ 


(6.13) 





for p\ results. It will now be shown that on writing E = E r + E ( °, 
H = H r 4-H (0 , in this expression, 

+ -v.(E'« A H r + E f A H ,i] )/c}dx tn ~o (6.14) 

and 

+ -v.(E«> A H (i ')M^ ( * 1 -«oKi -»*/<*)*, (6.15) 

so that an analogue 

cp\ - w(i - + ▼• fp 0 - **) + ^4, w - 2«g (6.16) 


of Dirac's equation for the electron is obtained, with a mass m just twice
the value $m_0$. The formula


m « 





(6.17) 


for the mass of the electron found in Q.T.R. is thereby confirmed. 

To prove (6.14), one observes from (3.10) that $\mathbf{E}^{(0)}$ satisfies

d ^ 

and ^ 0 =v • so that 

cX 


1 . 8 £^ /v y dEf{\ d 


c &x 


JW. 


E“>+v A H«>/f- -(1 H U) - v A E<«/r-o. 

cx 


(6.18) 


Applying these results to (6.14), the left-hand side reduces to 
-(1 - ‘ which vanishes on integration by parts, since 

d 

— • E f =o. To prove (6.15) it is necessary only to notice that, according 
to the calculations of § 3, the energy and momentum 

-Je i0 A H ,0 <fr (S) of the unperturbed self-field of the electron reduce to 

-»*/**) "* a °d ***▼(! - t/ 1 /* 1 )~* respectively. The validity of (6.16) 
is therefore verified. 

The Hamiltonian energy obtained from (6.11) by the substitution of
(6.16) is

-1/*/* 4 )*+V* (p 0 - +j|(E rt +H'V*'". (6.19) 





Here the interaction term $-e_0\mathbf{v}\cdot\mathbf{A}^r/c$ may justifiably be regarded as
small and treated quantum-mechanically as a perturbation. Before the
transition to quantum theory is completed, however, some questions
connected with the correspondence principle have to be considered.
Firstly, Dirac's matrices α, β may be introduced by writing

(i - —► 0*00; v —► (6.2 o) 

in the Hamiltonian. The velocity, however, occurs also in the interaction 
term in a more fundamental way, for, according to (5.29), 

X r -<Q-» 2 ( B *(k) exp (-ik.x,/#) + B(k) exp (*k. *,/<)). exp 

$\mathbf{b}(\mathbf{k}) = \mathbf{B}^*(-\mathbf{k}) + \mathbf{B}(\mathbf{k})$, (6.21)

where B(k) may be supposed to contain the positive frequencies, and
B*(-k) the negative frequencies of the Fourier component b(k). Here it
would be impossible to substitute from (6.20), and in applying the correspondence
principle one therefore writes

▼ -► pi (6.2 a) 

This is strictly correct for the unperturbed electron, and it would be pointless
to introduce a correction for the interaction here, as terms in av/c are
already omitted from the Hamiltonian.

The order of the factors in (6.21) has been deliberately chosen to secure 
an expression which is Hermitian and at the same time conforms with the
usual rule that the factor representing negative frequencies 

shall be written first, and the factor B(k)**’ 1 ^*, representing positive 
frequencies, last. The same rule is commonly applied to the term 

iJ(E* + - 2 {fi*<k).&(k) +^B*(k).B(k)j, (6.23) 

to prevent the appearance of the divergent “zero-point” energy V \ck. 

k 

The Hamiltonian for an electron in interaction with a radiation field is 
then 

H - 2 B * * B ) + +«• (Po - *A*)}, (6.24) 

with A r given by (6.21). 

The problem of the interaction of a number of electrons with the
radiation field may be treated similarly, and it is well known that such a
problem may be reduced to the single-body problem through the introduction
of second quantization of the electron field as well as of the
radiation field. The above results show that the customary theory is
modified only by the presence of a factor $\exp\{[(\mathbf{k}\cdot\mathbf{p}_0)^2/(m^2c^2+p_0^2) - k^2]/4b^2\}$
in the matrix element representing the absorption or emission of a photon
with momentum k by the electron. Here $[\mathbf{p}_0, (m^2c^2+p_0^2)^{1/2}]$ is to be
interpreted as the energy-momentum of the electron in the initial state
when the photon is emitted, in the final state when it is absorbed.

Thus the effect of the substitution of the reciprocally invariant
Lagrangian operator for the usual one in the theory of the interaction of
electrons and photons is summarized in the following simple prescription:
to the matrix elements of the perturbation energy representing the emission
of a photon with momentum k by an electron with momentum $\mathbf{p}_0$, or the
absorption of a photon with momentum k by an electron with momentum
$\mathbf{p}_0 - \mathbf{k}$, join a factor $\exp\{[(\mathbf{k}\cdot\mathbf{p}_0)^2/(m^2c^2+p_0^2) - k^2]/4b^2\}$. It will be found
that this prescription makes many of the divergent terms of ordinary
quantum electrodynamics finite.

APPENDIX I 


Let 


then 


G(s)-£ 


Proof of the Relation 

(-) m 2 m rn\ i 


(2 m + i)l (2 s + i)s\ 


(s-m) I (2 tn + i)l’ 


m = 0 

Writing g m (. s ) in terms of g n (s + 1), one has 

G(s) - £g m (s + i){(r + •}) - (m + J)} 

0 

={£(* + 1) -*, + i(r + i)}(j + H) + G(s) -g,(s) - 


2 (s + 1)! 


or 


G(s + i)(r + 5 ) -g,(s) + (i + 2 )f 1+ i(f +1) + 


2(f+ i)l 


Substituting g m (s) into (iii), one has 

<7(j + 1 ) 


(ar + 3)(j + i)l 

Hence by mathematical induction one obtains the required result. 


0) 


(ii) 


(iii) 





APPENDIX II 


Proof of the Relation 


y Hm+j) _WJV 1 L._ni)l 

,^tr(j+m+i) i -aj\r(^,+i+i) rwJ 


For n=o, it is easily seen that 

r(t) _» f r(}) sT(\) \ 

r(j+ 1 ) i-2r\r(j + i) r(s+i)J’ 

so that (i) is an identity. 

Now the difference 

y I>+1) _ f !>+}) 2 f TQi + 3) 1V + }) \ 

“or(j + « + J) £oF( s + m + I ) i-iArV + fi + a) F(fi + r + i)/ 


2 I>+2) 

1 -2S r(fl + S + 3) 


0 i+f-0t+j+i)} 


iy+t) 

Ityi + r + a)’ 


identically. The relation is therefore established. 


APPENDIX III 

The correction is obtained by the substitution of A\ given by (4.32), 
(4.33) and (4.35) into the following expression:— 

/100 d \ 

-e 0 (E x +V A HJe) - e 0 \- +-</>! - V A — A KJeJ. (i) 

The terms which are proportional to the zeroth power of r in U k give for (1) 




while the terms which are proportional to r give 


\efae~*( 2ir)" 1/, | -(▼•▼)▼ + A V A V + v )* + Jw“VC* •▼)(♦•▼)▼, 

L c* £* 


-i(V.V){ 


+i5^V(v-v)V + is^V(^-^) T +g^(^* v )^ 


+ —~ y 7 (^«V)*V + y®(^ • V)v + i5f-*y*^* v ) v 

Sr 4 4^* 

+^v 7 (V.v)(vv)v + ♦)(♦•▼)▼+^V(v-v)»v|j 

1 7 dxf d*x k d*x k \\ 

- §4*^(2")' - ds ? + 8^* * \ 4 a ‘ ds t ))• 

Summing (2) and (3), one has the total correction 

,, 9 (d*x k d*x k \dx\ 




(iv) 


APPENDIX IV 


In this appendix consideration is given to a type of Lagrangian
$L(x, x_1, x_2, \dots)$ depending on all time derivatives $x_1 = \dot{x}$, $x_2 = \ddot{x}, \dots$
of a co-ordinate x, but leading to a field equation

$$\frac{\partial L}{\partial x} - \frac{d}{dt}\frac{\partial L}{\partial x_1} + \frac{d^2}{dt^2}\frac{\partial L}{\partial x_2} - \dots = 0 \qquad \text{(i)}$$

which has the property that its solution is uniquely determined by the
values of x and $\dot{x}$ only at any instant. As is well known, the Hamiltonian


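$$H = -L + p_0 x_1 + p_1 x_2 + p_2 x_3 + \dots, \qquad p_n = \frac{\partial L}{\partial x_{n+1}} - \frac{d}{dt}\frac{\partial L}{\partial x_{n+2}} + \frac{d^2}{dt^2}\frac{\partial L}{\partial x_{n+3}} - \dots \qquad \text{(ii)}$$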


is defined in such a way as to satisfy the conservation law dH/dt = 0 by
virtue of the field equation (i). The field equation can then be reproduced
by the elimination of the momenta $p_n$ from the Hamiltonian equations


x 


3 H 

dp. 




J dH 8 L 


(iii) 





0 H 


. 0 H 0L 

rn" A*-l (*■!» 2, . . .). 


dx n dx n 


For this purpose the $x_n$ have to be treated as independent variables,
but as it is here supposed that only x and $x_1$ are independent, it must be
possible to express the Hamiltonian in terms of x and p only, where p is
the momentum conjugate to x. Assume, then, that $x_n = x_n(x, p)$, so


0H /0L 0L dx y 0L 
&x \0x + 0JC! dx + dx % dx 


dx x &x t 


?P* .tyi 


+ f 'Tx+ f 'te + ■ ■ ■ 


dL dL 

Eliminating —, -—, etc. with the help of (iii) and (iv), one has 

OX OXj 


( . dx l . dx t \ 

-\ h+Pl Tx +p '~^ + 


dpo . d/i. 

+ ~x + ~x x + . 
dx dx 1 


dPn fyn . dX n Bx n 

On replacing p n by —x + ~p and jfr* by — x + — p, the terms in A 
... ox op ox op 

cancel, leaving r 

— l (yii\ 

dx ” ~ p \\dx dp dp dx) \dx dp~ dp dx} ' ’ ' / 1 ; 

In a similar way it may be shown that 

8H J( — 8/>0 8 xd Po\,(dXi dpi dfx dPi\, \ ..... 

dp dp dp dx) \dx dp dp dx) ’ ’ / 


Then, if 


[*> /ol+ [*i» /il + [•*»> + • • 

the Hamiltonian equations 


da db da di 
[a> *’’dxdp~dp~dx 


dU . dU . 

<» 

are satisfied simultaneously. 

The relation (ix) is therefore the generalization of the usual commutation
rule for such systems; it is, in fact, equivalent to the relation [x, p] = 1
which is normally associated with the equations (x).





REFERENCES TO LITERATURE 

Bopp, F., 1940. Ann. d. Phys., xxxviii, 345.

-, 1943. Ibid., xlii, 575.

Born, M., 1909. Ann. d. Phys., xxx, 1.

-, 1934. Proc. Roy. Soc., A, cxliii, 410.

-, 1939. Proc. Roy. Soc. Edin., A, lix, 219.

Born, M., and Green, H. S., 1949. Proc. Roy. Soc. Edin., A, lxii, 470.

Born, M., and Infeld, L., 1934. Proc. Roy. Soc., A, cxliv, 425.

Born, M., and Rumer, G., 1931. Zeits. f. Phys., lxix, 141.

Dirac, P. A. M., 1938. Proc. Roy. Soc., A, clxvii, 148.

-, 1942. Ibid., clxxx, 1.

Heitler, W., 1944. The Quantum Theory of Radiation, Oxford University
Press.

Heitler, W., and Peng, H. W., 1942. Proc. Camb. Phil. Soc., xxxviii, 296.

Herglotz, G., 1903. Nach. K. Ges. Wiss. (Göttingen), 357.

Krazer, A., 1904. Verh. des 3. Internationalen Math.-Kongresses (Heidelberg).

McManus, H., 1948. Proc. Roy. Soc., A, clxxxxv, 323.

Pais, A., 1948. Developments in the Theory of the Electron, Publication of
Princeton Institute for Advanced Studies.

Podolsky, B., 1942. Phys. Rev., lxii, 68.

Pryce, M. H. L., 1938. Proc. Roy. Soc., A, clxviii, 389.

Schwinger, J., 1948. Phys. Rev., lxxiii, 416.

-, 1949. Ibid., lxxv, 651.

Sommerfeld, A., 1904. Nach. K. Ges. Wiss. (Göttingen).

Weisskopf, V., 1934. Zeits. f. Phys., lxxxix, 27.


(Issued separately May 14, 1951) 



( 139 ) 


XI.—Application of Relaxation Methods to Compressible Flow 
past a Double Wedge.* By A. R. Mitchell, Ph.D., and 
D. E. Rutherford, Dr.Math., D.Sc., United College, 
University of St Andrews. (With Seven Text-figures.) 

(MS. received July 18, 1949. Revised MS. received February 4, 1950. 

Read February 27, 1950), 

Synopsis 

The relaxation technique of R. V. Southwell is shown to be applicable in certain
cases to transonic problems. For a uniform stream with a low subsonic velocity impinging
on a symmetrical 2-dimensional double wedge, an asymmetrical supersonic
region can be isolated in the neighbourhood of the corner of the wedge, and the streamlines
and the values of the Mach number within this supersonic region can be determined
with the aid of relaxation methods. Difficulties must be expected to occur in the neighbourhood
of the sonic line, but in the present problem these have been surmounted.

1. Introduction

Although considerable progress has recently been made in the hodograph
plane, it seems unlikely that analytical solutions will be obtained in the 
near future of problems concerning a compressible fluid, in which both 
subsonic and supersonic regions occur. Considerable importance must 
therefore still be attached to numerical methods and to their development. 
The results described in the present paper are part of a wider investigation 
undertaken by one of us (A. R. Mitchell) with a view to discovering how 
far the relaxation technique devised by R. V. Southwell can be extended 
to problems of the type described above. The present paper treats mixed 
flow past a double wedge by relaxation methods. All the numerical 
calculations were carried out by A. R. Mitchell.


2. The Relaxation Method 


The equations of 2-dimensional steady motion for the irrotational flow 
of a non-viscous compressible fluid may be written in the following form 
(Green and Southwell, 1943):— 

V^-^x-o, (0 



W 


* Assisted in publication by a grant from the Carnegie Trust for the Universities of 
Scotland. 





In these formulae, ψ denotes the stream function; χ stands for $\rho^{-1/2}$, ρ
being the density; γ denotes the ratio of the specific heats; and c is the
local speed of sound. $c_s$, $\chi_s$ are the values of c, χ respectively at a stagnation
point.

It will be advantageous for our purposes to write these equations in 
the non-dimensional forms (Green and Southwell, 1943): 

(3) 

To do so, we select some significant linear dimension h pertaining to the 
specific problem under consideration, and write 

G 

vtm psS 

in which G is the mass flow per second under free stream conditions. 
Changing our notation slightly by writing x, x,yfo r the non-dimensional 
quantities x!x*> Vl^ respectively, equations (1) and (2) take the 

non-dimensional forms (3) and (4) 



[Fig. 1. A typical group of six isosceles triangles; the two lower nodes of the hexagon are at (-a, -b) and (a, -b).]

In order to use the method of relaxation the whole field of the fluid is 
covered by a regular network. Although in this paper we shall only use a 
network of equilateral triangles, we think that it is worth while pointing 
out that such a network is only a particular case of an isosceles triangular 
net. A typical group of six isosceles triangles is illustrated in fig. 1. 
By varying the angle θ, the shape of such an isosceles triangle may
be varied to suit the requirements of a particular problem. It will be seen
that the square and triangular nets described by Southwell are particular
cases (θ = 45°, 60°) of this more general net. The effectiveness of this
network, however, may be impaired if θ has a value outside the range
45° < θ < 75°, the reason being that in such cases the six nodes of a
typical hexagon, shown in fig. 1, are not the six nodes of the net which
are closest to the centre of the hexagon.
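
The quoted range of θ can be checked numerically. The following is a minimal sketch, not the authors' procedure: it assumes that θ is the base angle of the isosceles triangle, so that tan θ = b/a, and that the nodes of the net lie at (ma, nb) with m + n even, the hexagon about the origin being (±2a, 0), (±a, ±b). The function name `hexagon_is_nearest` is hypothetical.

```python
import math


def hexagon_is_nearest(theta_deg, a=1.0, span=6, tol=1e-9):
    """Return True if the six hexagon nodes are the six net nodes nearest the centre.

    Assumptions (not taken from the paper): tan(theta) = b / a, and the net
    consists of the points (m*a, n*b) with m + n even.
    """
    b = a * math.tan(math.radians(theta_deg))
    # The six nodes of the typical hexagon, as (m, n) index pairs.
    hexagon = {(2, 0), (-2, 0), (1, 1), (-1, 1), (1, -1), (-1, -1)}

    hex_dists = []
    other_dists = []
    for m in range(-span, span + 1):
        for n in range(-span, span + 1):
            if (m + n) % 2 != 0 or (m, n) == (0, 0):
                continue  # not a node of the net, or the centre itself
            d = math.hypot(m * a, n * b)
            if (m, n) in hexagon:
                hex_dists.append(d)
            else:
                other_dists.append(d)

    # The hexagon is "nearest" when no other node is strictly closer than
    # its farthest vertex (ties are allowed, as at theta = 45 degrees).
    return max(hex_dists) <= min(other_dists) + tol


for theta in (40, 45, 60, 75, 80):
    print(theta, hexagon_is_nearest(theta))
```

Under these assumptions the lower limit arises because the nodes (0, ±2b) come as close as (±2a, 0) when b = a, and the upper limit because the nodes (±4a, 0) come as close as (±a, ±b) when b = a√15, that is, at θ of about 75.5°.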





The finite difference approximation to equation (3) applicable to a net 
of isosceles triangles may be shown to be 

F * s -*> +&• -i)g *<*< -*>-* (5) 

in which $\chi_i$, $\psi_i$ denote the values of χ, ψ at the node labelled i (i = 0, 1, 2, . . ., 6)
in fig. 1. It can also be shown that the approximate equation
corresponding to (4) is 

yf/ 1 - *> ,(1 ■ ,) )“ V ^ L V‘ + i i .S ‘ a,M } 

+ lii + 1 § U4>i ~ 20o) }- (6) 

To obtain an approximate solution for any particular problem, the 
finite difference equations (5) and (6) must be satisfied approximately 
throughout the given region, either by trial and error, or by a mathematical 
plan called by Southwell (1946) a pattern. In practice, the least laborious 
method is to employ a combination of both methods. 

Initially, a value for ψ is assumed at each node of the net. Substituting
the appropriate values into equation (6), a relation is obtained from which
χ may be determined at any node. Equation (5) now determines the
residual F at each node. The procedure thereafter is to modify repeatedly
the ψ-distribution, either by trial and error or according to a recognized
plan, so that the residuals at each node are made as small as possible.
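
The idea of liquidating residuals node by node may be illustrated on a much simpler problem than the one treated here. The sketch below is an assumption-laden stand-in, not equations (5) and (6): it takes Laplace's equation on a square net (the incompressible limit, with χ constant), defines the residual at a node as the sum of its four neighbours less four times its own value, and at each step relaxes the node carrying the largest residual. The names `relax` and `demo` are hypothetical.

```python
def relax(psi, tol=1e-9, max_steps=100_000):
    """Liquidate residuals of the 5-point Laplace operator on a square net.

    psi is a list of rows; boundary values are held fixed and interior
    values are modified in place. Returns the number of relaxation steps.
    """
    rows, cols = len(psi), len(psi[0])
    interior = [(i, j) for i in range(1, rows - 1) for j in range(1, cols - 1)]

    def residual(i, j):
        return (psi[i - 1][j] + psi[i + 1][j] + psi[i][j - 1] + psi[i][j + 1]
                - 4.0 * psi[i][j])

    for step in range(max_steps):
        # Pick the node with the largest outstanding residual.
        i, j = max(interior, key=lambda node: abs(residual(*node)))
        r = residual(i, j)
        if abs(r) < tol:
            return step
        # Changing psi at the node by r/4 removes its residual entirely and
        # adds r/4 to the residual of each of its four neighbours.
        psi[i][j] += r / 4.0
    raise RuntimeError("residuals not liquidated within max_steps")


def demo():
    """5 x 5 net whose boundary values equal the column index; the exact answer is psi = j."""
    n = 5
    psi = [[float(j) if i in (0, n - 1) or j in (0, n - 1) else 0.0
            for j in range(n)] for i in range(n)]
    steps = relax(psi)
    return psi, steps


if __name__ == "__main__":
    psi, steps = demo()
    print(steps, [round(v, 6) for v in psi[2]])
```

In the compressible problem each change of ψ also alters χ through equation (6), which is why the changes in the residuals involve the additional terms discussed later in the text.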

In the case of the equilateral triangular net, which is the appropriate 
one for the problem here investigated, b=a%l 3. Writing A = 2a = 2bj* i /$ t 
and defining R by the relation 

equations (5) and (6) become respectively 

^0- 2 , Xityi-'k b)-°. (7) 

*o- (y -jr^ Uo'+l Z -*&)}• (») 

A l 6 1,*. M.M j 

We shall find it convenient to express the relationship between R and χ in
the form χ = f(R), the graph of which is exhibited in fig. 2. It is easily
verified that

f ,,n X* _ 

This relationship is graphed in fig. 3a and fig. 3b for γ = 1.4.


( 9 ) 



[Fig. 2. Graph of χ = f(R).]

Our next task in working out the pattern is to determine the changes
$\delta F_0, \dots, \delta F_6$ consequent on an assumed modification of $\psi_0$. The
required formulae, whose derivation will not be given here, are






+ < j - 1, • • -,6). (11) 

These formulae are obtained in the same manner as the corresponding
formulae for the square network were obtained by Green and Southwell
(1943). As we have said, the original values ψ being known at each node,
χ can be determined. Accordingly f'(R) can be determined from figs. 3a
and 3b at each node. From the formulae (10) and (11), the modifications
$\delta F_0, \dots, \delta F_6$ arising out of an increment $\delta\psi_0$ in $\psi_0$ may then be evaluated.
By a judicious choice of $\delta\psi_0$, the residual $F_0$ may be almost eliminated,
although this will in general be accompanied by a slight increase in the
residuals at the surrounding nodes.







When this process has been effected at all nodes at which outstanding
residuals occur, the resulting ψ-distribution is taken as a basis for a net of
finer mesh. The whole procedure is then repeated until the residuals
remaining may be considered negligible.

In the problem described in § 5 there would seem to be an inherent
difficulty in the employment of the relaxation pattern in the neighbourhood
of the wedge apex, since χ has a singularity at the apex. Since ψ is
known at all the nodes on the surface of the wedge, there is no difficulty
in determining χ from equation (8) at all interior nodes, including the
layer of nodes nearest to the boundary. We cannot, however, determine
$F_0$ at these "nearest" nodes, since formula (7), when applied to such nodes,
would require a knowledge of χ on the boundary itself. On the other
hand, $F_0$ can be obtained from (7) at the "next nearest" nodes to the
boundary. The relaxation pattern can accordingly be employed to determine
an improved value for ψ at all interior nodes with the exception of
those nearest to the boundary.

Since the "next nearest" layer can be brought as close as we please to
the boundary by choosing a fine enough mesh, the singularity of χ at the
apex need cause no concern. It should be borne in mind, however, that
the resulting approximate values of ψ at the "nearest" nodes will be less
accurate than those at other nodes, for the degree of approximation
obtained at "nearest" nodes will be that corresponding to a mesh half
as fine as the mesh actually used.

3. Supersonic Flow 

It has been suggested by other authors (Fox and Southwell, 1944; 
Southwell, 1946; Emmons, 1944) that certain problems in supersonic 
flow may not be amenable to relaxation treatment. In this section we 
shall attempt to show that many problems in supersonic flow can be 
examined just as easily as in the subsonic case by the relaxation technique. 
To explain the nature of Southwell's reluctance to use the method of
relaxation when the flow is supersonic, it is necessary to refer to the
formulae (9), (10) and (11).

If we tabulate the relation (9) for the case γ = 1.4 as under,


X 

1*00 

1-15 

1-20 

1-25- 



2-00 

270 

3-50 

M 

■ 

o -75 


1*0 - 

I'O + 

I ’4 


B 

3-0 

rw 

| >-25 

6*70 

B 

+ 00 

- 00 

-14 


B 

-236 


it will be observed that for supersonic speeds (M > 1), f'(R) has large
negative values, large in comparison with the positive values which it has
for subsonic speeds. From the occurrence of f'(R) in formulae (10) and (11),
Southwell concluded that for a Mach number M greater than unity very
large modifications $\delta F_j$ (j = 0, 1, . . ., 6) would be occasioned by small
increments $\delta\psi_0$, and that consequently the approximations yielded by the
relaxation method would not be convergent, or would, at best, converge
very slowly. The discontinuity in f'(R) at M = 1 clearly indicates that the
relaxation method will break down at sonic velocity, but in several problems
which we have attempted no insuperable difficulty was found, except in the
immediate neighbourhood of the sonic line, in applying the relaxation
technique to supersonic problems with M < 3.
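
The correspondence between χ and the Mach number in the tabulation can be cross-checked against the standard isentropic relation for a perfect gas. The sketch below is not relation (9): it assumes that the non-dimensional χ equals $(\rho_s/\rho)^{1/2}$, with $\rho_s$ the stagnation density, an assumption chosen because it gives the sonic value χ ≈ 1.25 quoted in the text for γ = 1.4. The function names are hypothetical.

```python
import math


def mach_from_chi(chi, gamma=1.4):
    """Mach number for a given chi, assuming chi = sqrt(rho_s / rho).

    Uses the isentropic relation rho_s / rho = (1 + (gamma - 1) / 2 * M**2) ** (1 / (gamma - 1)).
    """
    if chi < 1.0:
        raise ValueError("chi is at least 1 (its value at a stagnation point)")
    return math.sqrt(2.0 / (gamma - 1.0) * (chi ** (2.0 * (gamma - 1.0)) - 1.0))


def sonic_chi(gamma=1.4):
    """Value of chi at M = 1 under the same assumption."""
    return ((gamma + 1.0) / 2.0) ** (1.0 / (2.0 * (gamma - 1.0)))


if __name__ == "__main__":
    print(round(sonic_chi(), 4))
    for chi in (1.00, 1.15, 1.20, 2.00, 2.70, 3.50):
        print(chi, round(mach_from_chi(chi), 2))
```

With γ = 1.4 the sketch gives M close to 0.77 at χ = 1.15 and close to 2.94 at χ = 3.50, in reasonable agreement with the legible entries 0.75 and 3.0 of the tabulation.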

It will be observed that in formula (10) there are two terms which
contribute to the value of $\delta F_0$. So long as the term involving f'(R) is not
of a greater order of magnitude than the other term, the order of magnitude
of f'(R) itself need not cause concern. In such problems as we have
examined, it has been found that the term in f'(R) did not outweigh the
other term.

For example, at one node in whose surrounding hexagon a Mach
number of 0.9 occurred, we found that

\[ \text{(term in } f'(R)\text{)} = +\,[?], \qquad \text{(other term)} = +6.82, \]

and consequently that

\[ \delta F_0 = -8.01. \]

At another node in whose surrounding hexagon a Mach number of 1.05
occurred, we obtained

\[ \text{(term in } f'(R)\text{)} = -15.8[?], \qquad \text{(other term)} = +8.23, \qquad \delta F_0 = +7.59. \]

Analogous results were obtained for each $\delta F_s$. It may be added that the
two examples given above were specifically chosen as the most unfavourable
cases in the problem under examination for the finest net used. It should
be pointed out, however, that the employment of a still finer net would
have produced less favourable nodes, in whose surrounding hexagons
$0.9 < M < 1.05$.

In conclusion, it would seem that although in certain supersonic
problems the relaxation technique may become quite unworkable on
account of very large values of $f'(R)$, yet in many cases this is not so. There
are certainly many problems in supersonic flow which are amenable to
relaxation methods.

4. Methods available near a Sonic Line 

It will be evident from what has been said in the last section that, on
account of the singularity of $f'(R)$ at $\chi = 1.25$ ($M = 1$), the relaxation
technique, as described above, must break down in the neighbourhood of
the sonic line. As the sonic line is approached from either side, $f'(R)$
takes increasingly large positive or negative values. Equations (10) and
(11) show that under such circumstances negligible changes in the assumed
$\psi$-distribution give large changes in the residuals. In consequence, the
relaxation pattern becomes unworkable as a computational problem.



Relaxation Methods to Compressible Flow past a Double Wedge 147 


The difficulty mentioned above may be seen in another way. For low
subsonic or high supersonic flow, the modification in $\psi$ is made solely with a
view to the elimination of the residuals $F$. In such cases the appropriate
value of $\chi$ can be read off from fig. 2, either a value less than 1.25 in the
subsonic case or one greater than this value in the supersonic case. As,
however, the sonic line is approached from either side, it will be found
in general that a modification of $\psi$ may result in a value of $R$ which is
greater than the maximum possible one of 0.067. In such a case (cf. fig. 2)
no real value can be attached to $\chi$. The problem therefore resolves itself
into finding a modification of the $\psi$-distribution, which, while producing
values of $R$ which at each node are less than 0.067, at the same time
eliminates the residuals $F$.

The sonic line is initially, in point of fact, an undetermined boundary
between the subsonic and supersonic regions. Its position can only be
determined by approaching it from either side asymptotically, by employing
smaller and smaller hexagons, and by modifying the $\psi$-distribution subject
to the stipulations concerning $R$ already mentioned. This was eventually
accomplished by a method of trial and error. So far we have been unable
to devise any pattern which may be used in the immediate neighbourhood
of the sonic line. The usual relaxation pattern will certainly break down
in such regions.

Whether or not the procedure described above is a feasible one in every 
problem we are unable to say, but in several transonic problems which 
have been examined no insuperable difficulty has presented itself. 


5. Problem of the Wedge 

The particular problem described in this paper is that of a 2-dimensional
compressible, irrotational, non-viscous flow in a channel, past a symmetrical
double wedge, whose profile is shown in fig. 4. For simplicity in
calculation the semi-angle of the wedge was taken to be 30°.

In the free stream a low Mach number 0.205 ($v = 0.200$) was chosen
with a view to investigating how a supersonic region would develop
around the apex P, even when the free stream speed was low subsonic.
In addition, the choice of such a low free stream speed emphasizes the fact
that fine details of the flow, in particular the flow around P, may be
examined by reducing the mesh of the net appropriately.

The channel width 2OR was chosen to be five times the wedge width
2OP. With this ratio the solution of the problem should be a close
approximation to that of the flow past a wedge in a free stream. There is
little doubt that in the essential details of the supersonic region there is
no significant difference between the two cases.

It was further assumed that the flow across ST upstream of the wedge
was uniform, S being defined by the ratio SO : QO = 3. It did, in fact,
eventually transpire that the flow was essentially uniform much closer to
the wedge than this boundary.

Initially a symmetrical solution was assumed, as has been done by
other authors (Emmons, 1944; Jones, Bright and Andrews, 1946; and,
by implication, Thom and Klanfer, 1947), but the assumption of symmetry
was at a later stage discarded for the reasons discussed in § 6. These
assumptions are sufficient to determine $\psi$ on the closed rectilinear boundary
TT'S'Q'PQS. With a view to applying the methods described in the
preceding sections, the significant linear dimension $h$, which was
introduced in order to obtain non-dimensional equations, was chosen to be the
semi-channel width ST (or OR). For the first and widest net A (= 2a/h)
was chosen to be 1/5, and a corresponding $\psi$-distribution was evaluated,
which corresponded to subsonic conditions at each node of this net.
Repeated refinements were made until at A = 1/160 we were confronted
with the difficulties mentioned in § 4, associated with values of $R$ greater
than 0.067.






A further refinement A = 1/480 was made incorporating the methods 
suggested in § 4, and by this means a provisional supersonic region around 
P was isolated. It was found, however, that large residuals persisted on 
the line of symmetry PR, and that these residuals could not be eliminated
by any of the methods which we have described. 



At this stage, therefore, the assumption of symmetry was abandoned,
but we retained, as a basis for further calculations, the portion upstream of
the sonic line of the subsonic $\psi$-distribution which had already been
calculated. The flow picture up to the line of symmetry is shown in
figs. 5 and 6.

After much manipulation, a $\psi$-distribution was obtained stretching
downstream from the line of symmetry PR. This was done in a manner
similar to that employed upstream of the line of symmetry. The residuals,
including those on the line of geometrical symmetry PR, were now
everywhere small. Fig. 7 shows the asymmetrical supersonic region in the
vicinity of the wedge apex.

It might also be pointed out that the relaxation methods here described
may be used to determine the flow characteristics in a supersonic region,
should they be required. At every node in the supersonic region the local
Mach number and flow direction are known. Consequently, the directions
of the two characteristics may be calculated at each node, since the angle
between either characteristic and the direction of flow at a point in a
supersonic region is given by $\sin^{-1}(1/M)$. By interpolation, the directions
of the two characteristics at every point of the supersonic region may be
found. Accordingly, by starting anywhere on the boundary of the
supersonic region, the two characteristics at the point, one of each family,
can be traced throughout their length. This was in fact done, but in this
problem no interesting features emerged.
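
The calculation of the characteristic directions at a node can be sketched as follows. This is only an illustration of the rule quoted above (each characteristic makes the angle $\sin^{-1}(1/M)$ with the local flow direction); the function name, the use of degrees and the sample node values are assumptions made for the example and do not come from the paper.

```python
import math


def characteristic_directions(mach, flow_angle_deg):
    """Directions (degrees) of the two characteristics through a node.

    mach           -- local Mach number at the node (must exceed 1)
    flow_angle_deg -- local flow direction, measured from a fixed axis
    """
    if mach <= 1.0:
        raise ValueError("characteristics exist only where M > 1")
    mach_angle = math.degrees(math.asin(1.0 / mach))  # sin^-1(1/M)
    # One characteristic of each family lies at the Mach angle on either
    # side of the local flow direction.
    return flow_angle_deg + mach_angle, flow_angle_deg - mach_angle


# Hypothetical node: M = 2, flow inclined at 10 degrees to the axis.
left, right = characteristic_directions(2.0, 10.0)
print(round(left, 3), round(right, 3))
```

Repeating this at every supersonic node, and interpolating between nodes, would give the direction field along which the two families of characteristics are traced.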



It was not considered profitable to obtain a detailed solution for the
subsonic region downstream of the line of symmetry, since its
determination entailed no novel features.

It should be emphasized that we do not claim any great degree of
accuracy for the portion of fig. 7 which lies between the stream-line
$\psi = 0.010$ and the boundary of the wedge, nor for the very small
corresponding region in fig. 6. The reason for this is explained at the end of § 2.
Further, since the curvature of the stream-lines in this region is not
obtainable with sufficient precision from our calculations, we do not know at
what point the curvature of the lines $M$ = constant becomes pronounced so
that they meet the boundary normally. That the lines $M$ = constant do
behave in this way can be deduced from the formula $\tan\beta = Rq'/q$, in
which $q$ is the speed, $R$ is the radius of curvature of the stream-line and $\beta$
is the angle between the stream-line and the line $M$ = constant. The very
rapid increase in $R$ as the boundary is approached accounts for the fact
that $\beta$ may change from, say, 40° to 90° in a very short distance.
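
One way in which the formula for $\tan\beta$ may be obtained is sketched below. The sketch assumes natural coordinates ($s$ along the stream-line, $n$ normal to it), takes $q'$ to mean the rate of change of speed along the stream-line, and disregards sign conventions; none of these assumptions is stated in the text itself.

```latex
% Assumed notation: s = arc length along the stream-line, n = normal distance.
% Irrotational flow in natural coordinates:  \partial q/\partial n = q/R.
% A line M = constant is a line q = constant, so dq = 0 along it:
\[
  \frac{\partial q}{\partial s}\cos\beta
  + \frac{\partial q}{\partial n}\sin\beta = 0
  \quad\Longrightarrow\quad
  |\tan\beta|
  = \frac{|\partial q/\partial s|}{|\partial q/\partial n|}
  = \frac{R\,|q'|}{q},
  \qquad q' \equiv \frac{\partial q}{\partial s}.
\]
% As R increases without limit (straight boundary), \tan\beta \to \infty
% and \beta \to 90^\circ, so the line M = constant meets the wall normally.
```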

6. Symmetry

In this section we shall discuss in greater detail the question of 
symmetry raised in § 5. 

For the sake of argument, let us assume that the flow is symmetrical
about the line PR. Since the velocity must be continuous at points on
this line of symmetry, at such points the velocity must be parallel to the
free stream and to the channel walls. Consider now a supersonic region
surrounding the apex bounded by a line of constant Mach number $M'$,
where $M' > 1$. So long as the speed is increasing, the stream tubes must
diverge. It is readily seen that it is impossible to insert stream-lines
within this region, which diverge from this line $M = M'$ right up to the
line of symmetry, and which at the same time cross the line of symmetry
normally. Consequently, the assumption of a symmetrical flow implies
that on any stream-line passing through the supersonic region there must
be two points, symmetrically situated on either side of the line of symmetry,
at which the speed attains a maximum value.

Although there are indications that this situation may arise in certain
cases of flow about a blunt corner, it was considered unlikely that in our
problem a maximum speed would occur in front of the line of symmetry.

The assumption of symmetry was therefore abandoned, and an 
asymmetrical solution was sought in which the maximum speed occurred 
downstream of the line PR. 

There are additional arguments to support this point of view. Consider
two points Z and Z', symmetrically situated within the supersonic region,
with respect to PR. The situation at Z', downstream of Z, is of course
directly affected by the conditions existing at Z. On the other hand, the
situation at Z can only be affected by that at Z' if the signal travels first
of all downstream into the subsonic region, then travels backward within
the subsonic region until it reaches a point upstream of Z, and finally
travels downstream until it reaches Z. Thus, although the circumstances
at each point can affect those at the other, they do so in essentially different
ways. This argument indicates that it is at least probable that the flow
pattern is asymmetrical.

Finally, in the well-known Meyer expansion, the flow is not symmetrical
about the line of symmetry of the boundary. Since the flow in the
immediate vicinity of the apex must resemble very closely a Meyer expansion,
we conclude that the flow is certainly not symmetrical at the apex itself.



There are therefore good reasons for supposing that in mixed or
supersonic flow a symmetrical solution will not be obtained, even when the
boundary possesses a line of symmetry.

It was in fact found to be impossible to eliminate the residuals along 
PR when symmetry was assumed, and it was this fact which first led us to 
examine the assumption of symmetry more closely, and finally to discard 
it. It should be emphasized, however, that all the above objections to a 
symmetrical solution break down if the fluid is everywhere subsonic. As 
the free stream Mach number in our problem is very small, the presence of 
a supersonic region and its lack of symmetry may be attributed to the 
sharp corner at the apex of the boundary.


7. Uniqueness of the Solution 

So far as we are aware there has been no rigorous demonstration that
there is a unique solution of the fundamental equations (1) and (2) which
satisfies given boundary conditions.

Once the assumption of symmetry is abandoned in the present problem,
the boundary conditions consist of $\psi = 0$ along the channel axis and wedge
sides, $\psi = 1$ along the channel wall, and free stream conditions at entry to
the channel, infinitely far from the wedge. Conditions of flow are not
specified at any position downstream of the wedge. We are concerned
with a mixed continuous flow, the only supersonic region being situated
round the wedge apex.

Courant and Friedrichs (1948, p. 370) suggest that in such a mixed 
flow problem the above boundary conditions are sufficient, if continuity 
is assumed, to determine the flow uniquely, although there are some 
indications that a slight change in the shape of the wedge would alter the 
flow considerably. Certainly we were unable to obtain a solution other 
than, but differing slightly from, the solution obtained in § 5. 

Thus, while we would hesitate to assert categorically that the solution 
exhibited in figs. 5, 6 and 7 is unique, we hold nevertheless that there are 
good grounds for believing this to be so. 

8. Conclusion 

The most significant feature of the foregoing study would seem to be 
the possibility of using the relaxation technique in certain problems of 
mixed and supersonic flow. Admittedly, difficulties will be encountered 
in the neighbourhood of the sonic line, and these must be overcome by 
the method outlined in § 4. 





The possibility of treating supersonic problems carries with it the 
ability to determine graphically the flow characteristics by the methods 
of relaxation. 

We should also draw attention once more to the asymmetrical character 
of the solution obtained. Like many other authors before us, we first of 
all jumped to the conclusion that a symmetrical solution could be found. 
Later we were forced to reject this hypothesis in favour of that of an
asymmetrical solution. 

Lastly, the isosceles triangular network described in § 2, while not
utilized in the present paper, may be used to minimize computation in
problems in which the boundaries do not lie neatly on an equilateral
triangular net. The isosceles network would, for example, have been
advantageous in the present problem if the semi-wedge angle had been, 
say, 50°. 


REFERENCES TO LITERATURE 

Courant, R., and Friedrichs, K. O., 1948. Supersonic Flow and Shock
Waves, New York.

Emmons, H. W., 1944. “Numerical Solutions of Compressible Fluid Flow
Problems”, A.R.C. Report 7871.

Fox, L., and Southwell, R. V., 1944. “On the Flow of Gas through a Nozzle
with Velocities exceeding the Speed of Sound”, Proc. Roy. Soc., A, clxxxiii,
38-54.

Green, J. R., and Southwell, R. V., 1943. “High Speed Flow of Compressible
Fluid through a Two-Dimensional Nozzle”, Phil. Trans., A, ccxxxix,
367-386.

Jones, M., Bright, P., and Andrews, J. B., 1946. “The Squares Method
applied to Compressible Flow with a Shock Wave”, R.A.E. Technical Note
Aero 1833.

Southwell, R. V., 1946. Relaxation Methods in Theoretical Physics, Oxford.

Thom, A., and Klanfer, L., 1947. “Some Arithmetical Studies of the
Compressible Flow past a Body in a Channel”, A.R.C. Report 11010.


(Issued separately May 14, 1951)



XII.—Clebsch-Aronhold Symbols and the Theory of Symmetric 
Functions.* By H. W. Turnbull and A. H. Wallace, 

The University, St Andrews. 

(MS. received September 13, 1949. Revised MS. received January 25, 1950. 

Read February 6, 1950) 

Synopsis 

A square matrix $A = (a_{ij})$ is expressed symbolically in terms of Clebsch-Aronhold
equivalent symbols $a_{ij} = \alpha_i a_j = \beta_i b_j = \ldots$, and the symbolic expressions for symmetric
functions of the latent roots of $A$ are considered, the relation between these functions and
projective invariants of the bilinear form $uAx$ being noted. The Newton and Brioschi
relations between the symmetric functions are obtained by reduction of symbolic
determinants and permanents respectively, and the Wronskian relations are shown to be
equivalent to certain identities between determinants and permanents due to Muir.
Also the fundamental theorem of symmetric functions is obtained symbolically as a
consequence of the first fundamental theorem of invariants. The paper concludes with
a note on the symbolization of the $\lambda$-bialternants, that is of the traces of irreducible
invariant matrices of $A$.


1. Introduction

The Clebsch-Aronhold symbols were introduced by the authors from
whom they derive their name in their papers on invariant theory (namely,
Aronhold, 1858; Clebsch, 1861 a, b). Symbolic devices of a similar nature
had already been used by Sylvester: in his paper “On the Principles
of the Calculus of Forms” (Sylvester, 1852), symbols, which he called
umbrae, were used in the discussion of commutants; while in another
paper (Sylvester, 1851) a symbolic notation for the elements of a
determinant is introduced which is identical with the notation used in the
present paper to represent the elements of a matrix.

The usefulness of these symbols in various connections is due to the
fact that a number of results in algebra are really substitutional in
character, sets of suffixes or variables being permuted in the various terms
of the identities involved. But very often these sets of suffixes or variables
contain repeats which make the manipulation of substitutional operators
on them inconvenient. Clebsch-Aronhold symbols act as
position-markers, and the substitutional operations may be transferred to them
from the suffix or variable sets, while, by having a set of equivalent
symbolizations, it is ensured that the objects now being permuted contain
no repeats.

* This paper was assisted in publication by a grant from the Carnegie Trust for the
Universities of Scotland.

As used in invariant theory, the symbols replace the coefficients of
the ground forms by sets of symbolic vectors, and so the problem of finding
all concomitants of a set of ground forms is reduced to that of finding all
invariant polynomials in the elements of a set of vectors, either variables
or symbolic coefficients, all of which undergo either a certain linear
transformation or the transformation contragredient to it. And the first
fundamental theorem of projective invariants shows that all such invariant
functions are expressible rationally and integrally in terms of inner
products of the type

\[ u_x = \sum_{i=1}^{m} u_i x_i \]

and bracket factors of the types

\[ (u^{(1)} u^{(2)} \ldots u^{(m)}) \quad\text{and}\quad (x^{(1)} x^{(2)} \ldots x^{(m)}), \]

where $u$ and the $u^{(i)}$ are cogredient row vectors, $x$ and the $x^{(i)}$ are column
vectors contragredient to them, and the bracket factors represent $m$-rowed
determinants (Turnbull, 1945, pp. 173-180, and Chaps. XI and XII).

In particular the discussion of the invariant theory of a single form
bilinear in two sets of $m$ variables is given by Turnbull (1932), where the
coefficients of the form

\[ uAx = \sum_{i,j=1}^{m} u_i a_{ij} x_j \]

are written as

\[ a_{ij} = \alpha_i a_j = \beta_i b_j = \gamma_i c_j = \ldots, \]

these all being equivalent symbolizations. In the course of the work
certain results of the classical theory of symmetric functions are derived,
and expressed, by means of the Clebsch-Aronhold symbols. It is the
aim of this paper to push that investigation somewhat further.

In § 7 below use is made of a result in the theory of double standard 
forms which requires some explanation. 

A Young tableau of shape $(\lambda)$ consists of a set of $n$ symbols, not
necessarily all distinct, arranged in rows and columns, with $\lambda_1$ symbols
in the first row, $\lambda_2$ in the second, $\lambda_3$ in the third, and so on, where

\[ (\lambda) \equiv (\lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_h) \]

is a partition of $n$, such that

\[ \lambda_1 + \lambda_2 + \ldots + \lambda_h = n, \qquad
   \lambda_1 \geq \lambda_2 \geq \ldots \geq \lambda_h. \]

Also the first column is fully occupied with its $h$ symbols. For example,
corresponding to the partition (3, 2, 2) is the shape

    * * *
    * *
    * *

The tableau is called standard if no column contains any symbol twice, and
if, reading along each row from left to right or down each column, the
symbols appear in some preassigned order.
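
The definition of a standard tableau can be checked mechanically. The sketch below is one reading of it, assumed here rather than taken from the paper: symbols never decrease along a row, and strictly increase down a column (which is what prevents a column from containing a symbol twice). The function name and the list-of-rows representation are illustrative choices.

```python
def is_standard(tableau, order):
    """Check a tableau (list of rows) against a preassigned symbol order.

    Assumed reading of the definition: rows never decrease from left to
    right, and columns strictly increase downwards (so no column repeats
    a symbol).
    """
    rank = {symbol: k for k, symbol in enumerate(order)}
    rows = [[rank[s] for s in row] for row in tableau]
    # Row lengths must form a partition: non-increasing down the tableau.
    if any(len(rows[i]) < len(rows[i + 1]) for i in range(len(rows) - 1)):
        return False
    rows_ok = all(r[j] <= r[j + 1] for r in rows for j in range(len(r) - 1))
    cols_ok = all(
        rows[i][j] < rows[i + 1][j]
        for i in range(len(rows) - 1)
        for j in range(len(rows[i + 1]))
    )
    return rows_ok and cols_ok


U = [["u", "u", "v"], ["v", "w"], ["w"]]          # shape (3, 2, 1)
print(is_standard(U, "uvw"))                      # rows and columns in order
print(is_standard([["u", "v"], ["u"]], "uvw"))    # first column repeats u
```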

If a set of $r$ row vectors $u, v, \ldots, w$ and a set of the same number of
column vectors $x, y, \ldots, z$ are given, then the $r$th compound inner
product, or bideterminant, of the two sets of vectors, is defined as the
$r$-rowed determinant

\[ (uv \ldots w \mid xy \ldots z) =
   \begin{vmatrix}
     u_x & u_y & \ldots & u_z \\
     v_x & v_y & \ldots & v_z \\
     \vdots & \vdots & & \vdots \\
     w_x & w_y & \ldots & w_z
   \end{vmatrix}. \]

If $r > m$, this vanishes.
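
A small numerical sketch of the bideterminant may help. It builds the $r$-rowed array of inner products and expands the determinant over permutations; the helper names and the sample vectors are invented for the illustration. With $m = 2$ and $r = 3$ the result is zero, in line with the remark that the bideterminant vanishes when $r > m$.

```python
from itertools import permutations


def inner(u, x):
    """Inner product u_x of a row vector u with a column vector x."""
    return sum(a * b for a, b in zip(u, x))


def det(mat):
    """Determinant by expansion over permutations (fine for small r)."""
    n = len(mat)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(
            1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j]
        )
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= mat[i][perm[i]]
        total += term
    return total


def bideterminant(rows, cols):
    """(uv...w | xy...z): determinant of the array of inner products."""
    return det([[inner(u, x) for x in cols] for u in rows])


# Hypothetical vectors with m = 2 components.
print(bideterminant([(1, 2), (3, 4)], [(1, 0), (0, 1)]))                   # r = 2
print(bideterminant([(1, 2), (3, 4), (5, 6)], [(1, 0), (0, 1), (1, 1)]))   # r = 3 > m
```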

If $U$ is a tableau of shape $(\lambda)$ whose symbols are row vectors, and $X$ is
a tableau of the same shape whose symbols are column vectors, the double
form, written $\{U \mid X\}$, is defined to be the product of the compound inner
products of the sets of vectors in the corresponding columns of $U$ and $X$.
Thus

\[ \left\{ \begin{matrix} u & u & v \\ v & w & \\ w & & \end{matrix}
   \;\middle|\;
   \begin{matrix} x & y & z \\ y & z & \\ z & & \end{matrix} \right\}
   = (uvw \mid xyz)\,(uw \mid yz)\, v_z. \]

If $U$ and $X$ are both standard, the form $\{U \mid X\}$ is called a double standard
form.

It is proved in Turnbull (1945, Chap. XXIII) that double standard
forms in a given set of variables are linearly independent, and any
non-standard double form may be expressed linearly in terms of them. In this
cited chapter, the notation for double forms does not involve curly brackets;
they are introduced here to avoid confusion between the symbols of forms
written side by side.

2. The Symbolization of a Square Matrix 

Let $A = [a_{ij}]$ be a square matrix of order $m \times m$ in which the $i, j$th
element is symbolized by

\[ a_{ij} = \alpha_i a_j = \beta_i b_j = \gamma_i c_j = \ldots \]

The symbolic expression of a single element needs just one pair of symbols,
say $a$ and $\alpha$; a product of any two elements needs two such pairs, say
$a$, $\alpha$ and $b$, $\beta$, and so on; for example

\[ a_{ij} a_{hk} = \alpha_i a_j \beta_h b_k = \beta_i b_j \alpha_h a_k; \]

but $\alpha_i a_j \alpha_h a_k$ is ambiguous. Each pair of symbols, as illustrated here, is
equivalent to, and so interchangeable with, each other pair, English letter
with English letter and Greek with Greek, such an interchange leaving
unaltered the value of the expression which is symbolized. The symbolic
factors are commutative with respect to multiplication, and of course this
commutative property introduces no danger of ambiguous interpretation;
in fact $a_{ij} a_{hk}$ could also be written as $\alpha_i \beta_h a_j b_k$, and in interpreting this
symbolic product the factor $a_j$ must be taken along with the $\alpha_i$ even though
these symbols do not actually appear together; the preassigned convention
that $a$ and $\alpha$ are to be regarded as companion symbols ensures an
unambiguous reading of the symbolic product. We shall be dealing here
with homogeneous polynomials in the elements of the matrix $A$, and
the symbolization of each term of such a polynomial, say of degree $r$, will
therefore contain $r$ different English letters $a, b, c, \ldots$ and the
corresponding $r$ Greek letters $\alpha, \beta, \gamma, \ldots$

It is worth noting that although, in general, $a$ and $\alpha$ differ, being
symbolizations of $a_{ij}$ and $a_{ji}$ respectively, they are equal if $A$ is a symmetric
matrix. Thus for a symmetric matrix the simpler notation

\[ a_{ij} = a_i a_j = b_i b_j = c_i c_j = \ldots \]

is available, without the use of Greek letters.

3. Invariants under the Transformation $\bar{A} = HAH^{-1}$

A homogeneous polynomial in the $a_{ij}$ is a symbolic multilinear form
in the elements of the column vectors

\[ \alpha = \{\alpha_1\, \alpha_2 \ldots \alpha_m\}, \quad
   \beta = \{\beta_1\, \beta_2 \ldots \beta_m\}, \quad \ldots \]

and the row vectors

\[ a = [a_1\, a_2 \ldots a_m], \quad b = [b_1\, b_2 \ldots b_m], \quad \ldots \]

Under the similarity transformation

\[ \bar{A} = HAH^{-1}, \]

which involves a non-singular matrix $H$ of order $m \times m$, these symbolic
vectors undergo the transformations

\[ \bar\alpha = H\alpha, \quad \bar\beta = H\beta, \quad \ldots, \qquad
   \bar a = aH^{-1}, \quad \bar b = bH^{-1}, \quad \ldots \tag{1} \]

where $\bar\alpha, \bar\beta, \ldots$ and $\bar a, \bar b, \ldots$ are formed, like the corresponding vectors
$\alpha, \beta, \ldots$ and $a, b, \ldots$, from the symbolic form of $\bar A$, namely from

\[ \bar a_{ij} = \bar\alpha_i \bar a_j = \bar\beta_i \bar b_j = \ldots \]

By the first fundamental theorem of projective invariants, any multilinear
form $\phi$, in the elements of the symbolic vectors
$\alpha, \beta, \ldots, a, b, \ldots$, which is a nilbaric, or absolute, invariant under the
transformations (1) for all non-singular $H$, that is a form such that

\[ \phi(\bar\alpha, \bar\beta, \ldots, \bar a, \bar b, \ldots) = \phi(\alpha, \beta, \ldots, a, b, \ldots), \]

is always expressible as a polynomial in the inner products
$a_\alpha, a_\beta, \ldots, b_\alpha, b_\beta, \ldots, c_\alpha, c_\beta, \ldots$, etc., where

\[ a_\alpha = \sum_{i=1}^{m} a_i \alpha_i. \]

But if the elements of $A$ are taken to be independent variables, it may
be assumed that the latent roots of $A$ are all distinct; and so any
polynomial in the $a_{ij}$ which is invariant under the transformation $\bar A = HAH^{-1}$
is first of all a polynomial in the latent roots of $A$, as may be seen by
choosing $H$ such that $\bar A$ is a diagonal matrix. Then since the set of all
non-singular matrices $H$ includes the set of all permutation matrices, that
is, matrices obtained from the $m$-rowed unit matrix by permuting its
rows, any such invariant polynomial in the $a_{ij}$ must be a symmetric function
in the latent roots of $A$. Conversely, any symmetric function of the
latent roots of $A$ must be invariant under all transformations $\bar A = HAH^{-1}$,
since such transformations leave unaltered the set of latent roots as a whole,
at most permuting them among themselves.

Hence any symmetric function in the latent roots of $A$ may be written
symbolically as a polynomial in the inner products

\[ a_\alpha, a_\beta, \ldots, b_\alpha, b_\beta, \ldots, \]

and conversely, any polynomial in the $a_{ij}$ which, when symbolized, can be
expressed rationally and integrally in terms of these inner products is a
symmetric function of the latent roots of $A$.

4. Closed, Open, and Prime Products 

A symbolic product of the $a_\alpha, a_\beta, \ldots, b_\alpha, b_\beta, \ldots$ in which to each
English letter appearing there is a corresponding Greek letter, and vice
versa, will be called a closed product. A product such as $a_\beta b_\gamma$ or $a_\alpha b_\gamma$, or
$a_\beta$ itself, where certain letters appear without their companions, will be
called open. A closed product then symbolizes an actual polynomial in
the $a_{ij}$; it is indeed a symmetric function of the latent roots of $A$; but an
open product is purely symbolic and has no actual interpretation in terms
of the $a_{ij}$. A closed product which contains open factors only will be
called prime; while if it can be factorized into further closed products it
will be called composite. Thus each of $a_\alpha$, $a_\beta b_\alpha$, $a_\beta b_\gamma c_\alpha$, ... is prime,
whereas $a_\alpha b_\gamma c_\beta$ is composite, having the closed factors $a_\alpha$ and $b_\gamma c_\beta$. The
value of a prime product depends only on the number of distinct symbol
pairs occurring; this is a direct consequence of the equivalence of symbol
pairs; e.g. $a_\beta b_\alpha = c_\delta d_\gamma$ and $a_\beta b_\gamma c_\alpha = a_\gamma b_\alpha c_\beta$. Hence the value of a
composite product depends only on the way in which the total number of its
distinct pairs of symbols is partitioned so as to form prime products.

This may be expressed differently by regarding a closed product
involving the $n$ letters $a, b, c, \ldots$ and their $n$ companions $\alpha, \beta, \gamma, \ldots$
as the result of operating on the sequence $a, b, c, \ldots$ in the product

\[ a_\alpha b_\beta c_\gamma \ldots \]

with a certain permutation of the symmetric group on $n$ letters; then
the value of the composite product depends only on the class of the
permutation in question. The product then contains a prime symbolic
factor involving $k$ symbol pairs, corresponding to each cycle of order $k$ in
the permutation. Let a prime product of $k$ symbol pairs be denoted by $s_k$,
and let

\[ (\rho) = (\rho_1, \rho_2, \ldots) \]

be a set of positive integers such that

\[ \rho_1 + 2\rho_2 + 3\rho_3 + \ldots = n. \]

Further, let

\[ S_{(\rho)} = s_1^{\rho_1} s_2^{\rho_2} s_3^{\rho_3} \ldots \tag{2} \]

It follows that $S_{(\rho)}$ is obtained from $a_\alpha b_\beta c_\gamma \ldots$ by operating on the
sequence $a, b, c, \ldots$ with a permutation of class $(\rho)$. Or if $C_{(\rho)}$ is the sum
of the permutations of class $(\rho)$ in the symmetric group on $n$ letters, and
$[?]_{(\rho)}$ is the number of such permutations within the class, we may write

\[ S_{(\rho)} = \frac{1}{[?]_{(\rho)}}\, C_{(\rho)}\, a_\alpha b_\beta c_\gamma \ldots \tag{3} \]

(The change from Frobenius' symbol $h_{(\rho)}$ to $[?]_{(\rho)}$ is made to avoid
confusion, as $h_r$ is now used in a different connection.)
Thus any homogeneous symmetric function of degree $n$ of the latent
roots of $A$ may be written symbolically as a linear combination of closed
products by the first fundamental theorem of invariants. Each of these
products must be one of the $S_{(\rho)}$ and so may be written in the form (3).
Thus the given symmetric function is the result of operating upon the
sequence $a, b, c, \ldots$ in $a_\alpha b_\beta c_\gamma \ldots$ with a linear combination of the
operators $C_{(\rho)}$; that is, the function may be written as

\[ X\, a_\alpha b_\beta c_\gamma \ldots \]

where $X$ is a substitutional operator and, moreover, is a linear combination
of the class sums of the symmetric group on $n$ letters. Equally well the
operator $X$ could act on the sequence $\alpha, \beta, \gamma, \ldots$ of Greek letters instead
of on the corresponding English ones.


5. Certain Special Symmetric Functions 


The prime products

\[ s_1 = a_\alpha, \quad s_2 = a_\beta b_\alpha, \quad s_3 = a_\beta b_\gamma c_\alpha, \quad \ldots \]

have already been introduced, but it remains to evaluate them in terms of
the latent roots of $A$. To do this, consider the $i, j$th elements of the
matrices $A, A^2, A^3, \ldots$; they are, respectively,

\[ \alpha_i a_j, \quad \alpha_i a_\beta b_j, \quad \alpha_i a_\beta b_\gamma c_j, \quad \ldots \]

by the ordinary rule for matrix multiplication. Putting $j = i$ and summing
with respect to $i$, it follows that the traces of these matrices are, respectively,

\[ a_\alpha, \quad a_\beta b_\alpha, \quad a_\beta b_\gamma c_\alpha, \quad \ldots, \]

that is,

\[ s_r = \text{trace of } A^r = \operatorname{tr}(A^r), \text{ say.} \]

By Sylvester's theorem, the trace of $A^r$ is the sum of the $r$th powers of the
latent roots of $A$, and so this gives the value of $s_r$.

Or, independently of Sylvester's theorem, $s_r$ may be evaluated by using
the fact that it is invariant under similarity transformations; thus $A$ may
be taken in the first place to be a diagonal matrix, with diagonal elements
$\omega_1, \omega_2, \ldots, \omega_m$. Then

\[ a_{ij} = \delta_{ij}\,\omega_i. \]

In this case, $s_3$ (taking $r = 3$ for convenience in writing) is given by

\[ s_3 = a_\beta b_\gamma c_\alpha = \sum a_i \beta_i\, b_j \gamma_j\, c_k \alpha_k, \]

the summation being with respect to $i$, $j$ and $k$ independently, each from
1 to $m$. But, since $A$ is diagonal, only the terms with $i = j = k$ survive, so that

\[ s_3 = \sum_{i=1}^{m} \omega_i^3. \]

And in general, as before,

\[ s_r = \sum_{i=1}^{m} \omega_i^r. \]
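
The identity $s_r = \operatorname{tr}(A^r) = \sum_i \omega_i^r$ can be checked numerically on a small example. The sketch below uses a hypothetical upper-triangular matrix, chosen only because its latent roots can be read off the diagonal; the helper functions are illustrative and use exact integer arithmetic.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def power_sums_from_traces(a, r_max):
    """Return [tr(A), tr(A^2), ..., tr(A^r_max)]."""
    sums, power = [], a
    for _ in range(r_max):
        sums.append(trace(power))
        power = matmul(power, a)
    return sums


# Hypothetical matrix: upper triangular, so its latent roots are 2, 3, -1.
A = [[2, 1, 0],
     [0, 3, 5],
     [0, 0, -1]]
roots = [2, 3, -1]

traces = power_sums_from_traces(A, 4)
direct = [sum(w ** r for w in roots) for r in range(1, 5)]
print(traces, direct)
```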

Now consider the symbolic bideterminants

\[ M_1 = (a \mid \alpha) = a_\alpha; \qquad
   M_2 = (ab \mid \alpha\beta) =
   \begin{vmatrix} a_\alpha & a_\beta \\ b_\alpha & b_\beta \end{vmatrix}; \]

\[ M_3 = (abc \mid \alpha\beta\gamma) =
   \begin{vmatrix} a_\alpha & a_\beta & a_\gamma \\
                   b_\alpha & b_\beta & b_\gamma \\
                   c_\alpha & c_\beta & c_\gamma \end{vmatrix}; \quad \text{etc.}, \]

where $M_r$ is an $r$th compound inner product or bideterminant. These
are symmetric functions of the latent roots of $A$, for they are obtained from
the product $a_\alpha b_\beta c_\gamma \ldots$ by determinantal permutation, and
are hence sums of closed products. These $M$-invariants may be evaluated
as in Turnbull (1932, p. 6), or again by supposing $A$ to be diagonal, with
$a_{ij} = \delta_{ij}\,\omega_i$. Then, for $r = 3$, say,

\[ M_3 = \dot a_\alpha \dot b_\beta \dot c_\gamma
       = \sum \dot a_i \dot b_j \dot c_k\, \alpha_i \beta_j \gamma_k. \]

On account of the determinantal permutation (a sum of six terms) indicated
by the dots over $a$, $b$ and $c$, non-zero expressions are obtained in the
summation with respect to $i$, $j$ and $k$ only if these indices are all distinct.
Each set of three distinct indices appears in all possible orders, namely
3! times, and each time the only non-zero contribution is $\omega_i \omega_j \omega_k$. Hence

\[ M_3 = 3!\, e_3, \]

where $e_3$ is the third elementary symmetric function of the latent roots
of $A$. In general,

\[ M_r = r!\, e_r, \]

where $e_r$ is the $r$th elementary symmetric function of $\omega_1, \ldots, \omega_m$.
A third set of symmetric functions is produced by permanental summation
applied to $a_\alpha$, $a_\alpha b_\beta$, $a_\alpha b_\beta c_\gamma$, . . . etc., namely

$$N_1 = a_\alpha,$$

$$N_2 = \overset{+}{|}\,a_\alpha b_\beta\,\overset{+}{|} \equiv a_\alpha b_\beta + a_\beta b_\alpha,$$

$$N_3 = \overset{+}{|}\,a_\alpha b_\beta c_\gamma\,\overset{+}{|} \equiv a_\alpha b_\beta c_\gamma + a_\alpha b_\gamma c_\beta + \ldots + a_\gamma b_\beta c_\alpha,$$

and so on. $N_r$ will consist of $r!$ terms, each with a positive sign (whereas
$M_r$ consists of the same $r!$ terms, half with positive and half with negative
signs). $N_r$ may be called a bipermanent, and an appropriate notation is

$$N_2 = (ab \,\|\, \alpha\beta),$$

$$N_3 = (abc \,\|\, \alpha\beta\gamma),$$

and so on, where the double vertical line replaces the single line in $M_r$.

To evaluate the $N_r$, suppose, as in the case of the $M_r$, that $A$ is diagonal.
Then for $r = 3$, say,

$$N_3 = \sum \overset{+}{a}_i\,\overset{+}{b}_j\,\overset{+}{c}_k\;\alpha_i\beta_j\gamma_k,$$

summing over all sets $i, j, k$, in all possible orders. In general, fixing
attention on a certain set of $r$ suffixes attached to the $r$ letters $\alpha, \beta, \gamma, \ldots$,
suppose that this set contains $p$ equal suffixes of one kind, $q$ equal suffixes
of a second kind, $s$ of a third kind, and so on. Then, in the summation
with respect to sets of suffixes there are $\binom{r}{p,\,q,\,s,\,\ldots}$ arrangements of this
particular set. For each such arrangement the permanental summation
with respect to $a, b, c, \ldots$ gives a non-zero contribution only when the
arrangement of the suffixes of $a, b, c, \ldots$ coincides with that of the
suffixes of $\alpha, \beta, \gamma, \ldots$ (since $A$ is diagonal); but since the sets of repeated
suffixes may be permuted among themselves this coincidence may happen
in $p!\,q!\,s!\ldots$ ways. Hence the non-zero contribution to $N_r$ due to this
selected suffix set is $r!$ times a certain product of degree $r$ of the latent
roots, $\omega_i$, of $A$, and the summation with respect to suffix sets becomes a
summation over all possible products of degree $r$ of the $\omega_i$. In fact,

$$N_r = r!\,h_r,$$

where $h_r$ is the $r$th complete homogeneous function of the latent roots
of $A$.
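The two evaluations $M_r = r!\,e_r$ and $N_r = r!\,h_r$ can be checked numerically. The following Python sketch is not part of the original paper: it assumes a small set of integer latent roots chosen purely for illustration, and it assumes that each term of the expanded bideterminant (or bipermanent) is a product of closed products $s_k$, one for each cycle of the corresponding permutation, with the sign of the permutation attached in the determinantal case only. All function names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement, permutations
from math import factorial, prod


def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a tuple of images of 0..r-1."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, j = 0, start
        while j not in seen:
            seen.add(j)
            j = perm[j]
            length += 1
        lengths.append(length)
    return lengths


def bideterminant_and_bipermanent(roots, r):
    """M_r (signed) and N_r (unsigned) as sums of products of power sums s_k."""
    def s(k):
        return sum(w ** k for w in roots)

    M = N = 0
    for perm in permutations(range(r)):
        lengths = cycle_lengths(perm)
        term = prod(s(k) for k in lengths)
        sign = (-1) ** (r - len(lengths))  # parity of the permutation
        M += sign * term
        N += term
    return M, N


def e(roots, r):
    """r-th elementary symmetric function."""
    return sum(prod(c) for c in combinations(roots, r))


def h(roots, r):
    """r-th complete homogeneous symmetric function."""
    return sum(prod(c) for c in combinations_with_replacement(roots, r))


roots = [2, 3, 5]  # illustrative latent roots, so m = 3
for r in range(1, 5):
    M, N = bideterminant_and_bipermanent(roots, r)
    assert M == factorial(r) * e(roots, r)
    assert N == factorial(r) * h(roots, r)
```

With three latent roots the case $r = 4$ also confirms that $M_4$ vanishes, since $e_4 = 0$.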


6. Relations between Symmetric Functions

It has already been shown (Turnbull, 1932) that the $M$-invariants and
the first $m$ of the $s_r$ each form a complete and irreducible set of invariants
for a single bilinear form $uAx$, where $u$ and $x$ are contragredient vectors.
Here the variable vector $x$ is contragredient also to the symbolic vector $\alpha$,
and cogredient with the vector $a$. In the present paper these results, and
certain others, will now be derived, the emphasis here, however, being put
on the fact that the invariants discussed are symmetric functions of the
latent roots of a matrix, rather than on the fact that they are projective
invariants of a certain bilinear form.

The theorems which are about to be proved on irreducible bases for 
symmetric functions give an illustration of the working of the second 
fundamental theorem of invariants, which states that every identity among 
invariants, expressed symbolically, can ultimately be expressed by means 
of certain fundamental identities together with the principle of interchange 
of equivalent symbols and the laws of ordinary algebra for the manipulation 
of symbolic inner products and bracket factors (Turnbull, 1945, p. 214). 
In the present instance these fundamental identities are equivalent to 
stating that any compound inner product M r of order r > m vanishes 
identically. 

If, now, any compound inner product $M_r$ is expanded as a determinant,
there will be $(r-1)!$ terms, each with the same sign and each equal to $s_r$,
since, in the symmetric group on $r$ letters, there are just $(r-1)!$ permutations
each consisting of one cycle of order $r$. The rest of the expansion is a
polynomial in the $s_i$ for $i < r$. If $r > m$, $M_r = 0$, and so, for $r > m$, the
identity

$$0 = (r-1)!\,s_r + \psi(s_1, \ldots, s_{r-1})$$

holds, $\psi$ being a polynomial. Hence $s_r$ may be expressed as a polynomial
in the functions $s_i$ for $i < r$; and by repeating the process, an expression
is obtained for $s_r$ as a polynomial in the $s_i$ for $i \le m$. By the very nature
of the symbolic expression of symmetric functions, any symmetric function
of the latent roots of $A$ may be written as a polynomial in the functions $s_i$,
and by the result just obtained this polynomial may be reduced to a
polynomial in the $s_i$ for $i \le m$. Alternatively any homogeneous symmetric
function of degree $n$ of the latent roots of $A$ may be written as a linear
combination of the functions

$$s_1^{p_1}s_2^{p_2}\cdots s_n^{p_n}, \qquad p_1 + 2p_2 + \ldots + np_n = n,$$

where $p_r$ is zero whenever $r > m$.
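As a worked instance of this reduction (an illustration added here, not taken from the original), take $m = 2$ and $r = 3$. Expanding $M_3$ over the six permutations of three letters gives one identity term, three transpositions and two 3-cycles, so that the vanishing of $M_3$ expresses $s_3$ through $s_1$ and $s_2$:

```latex
\begin{aligned}
0 = M_3 &= s_1^3 - 3\,s_1 s_2 + 2\,s_3
  \quad\Longrightarrow\quad
  s_3 = \tfrac{1}{2}\bigl(3\,s_1 s_2 - s_1^3\bigr), \\[4pt]
\text{check: }\;
\tfrac{1}{2}\bigl(3(\omega_1+\omega_2)(\omega_1^2+\omega_2^2) - (\omega_1+\omega_2)^3\bigr)
  &= \tfrac{1}{2}\bigl(2\omega_1^3 + 2\omega_2^3\bigr)
   = \omega_1^3 + \omega_2^3 .
\end{aligned}
```

Here the coefficient of $s_3$ is $(r-1)! = 2$, in agreement with the count of cycles of order 3.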

In Turnbull (1932) it is shown that a symbolic reduction of the
determinantal expression for $M_r$ ($r \le m$) leads to the formula

$$\frac{M_r}{(r-1)!} = \frac{M_{r-1}}{(r-1)!}\,s_1 - \frac{M_{r-2}}{(r-2)!}\,s_2 + \ldots + (-)^{r-1}s_r, \qquad (4)$$

i.e.

$$r\,e_r = e_{r-1}s_1 - e_{r-2}s_2 + \ldots + (-)^{r-1}s_r. \qquad (5)$$

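To prove this result, consider first the expansion of the symbolic
determinant of $r$ rows and $r$ columns

$$(acde\ldots \mid \beta\gamma\delta\epsilon\ldots) =
\begin{vmatrix}
a_\beta & a_\gamma & a_\delta & a_\epsilon & \ldots \\
c_\beta & c_\gamma & c_\delta & c_\epsilon & \ldots \\
d_\beta & d_\gamma & d_\delta & d_\epsilon & \ldots \\
e_\beta & e_\gamma & e_\delta & e_\epsilon & \ldots \\
\vdots & \vdots & \vdots & \vdots &
\end{vmatrix}$$

in terms of its first column and cofactors; here each English letter except
$a$ has a companion Greek letter, and the English letter appears
corresponding to each Greek letter except $\beta$. Now

$$\begin{aligned}
(acde\ldots \mid \beta\gamma\delta\epsilon\ldots)
 &= a_\beta(cde\ldots \mid \gamma\delta\epsilon\ldots) - c_\beta(ade\ldots \mid \gamma\delta\epsilon\ldots) \\
 &\quad + d_\beta(ace\ldots \mid \gamma\delta\epsilon\ldots) - e_\beta(acd\ldots \mid \gamma\delta\epsilon\ldots) + \ldots \\
 &= a_\beta(cde\ldots \mid \gamma\delta\epsilon\ldots) - c_\beta(ade\ldots \mid \gamma\delta\epsilon\ldots) \\
 &\quad + c_\beta(ade\ldots \mid \delta\gamma\epsilon\ldots) - c_\beta(aed\ldots \mid \epsilon\delta\gamma\ldots) + \ldots
\end{aligned}$$

by the interchange of pairs of equivalent symbols. Then, by rearranging
the rows and columns of the determinants on the right,

$$(acde\ldots \mid \beta\gamma\delta\epsilon\ldots) = a_\beta(cde\ldots \mid \gamma\delta\epsilon\ldots) - (r-1)\,c_\beta(ade\ldots \mid \gamma\delta\epsilon\ldots), \qquad (6)$$

where the cofactor of $a_\beta$ is a sum of closed products, while the cofactor of
$c_\beta$ is similar to the original determinant, namely, with all the English and
Greek letters paired off except the first in each set. Similarly if $a$ is
replaced by $b$,

$$(bcde\ldots \mid \beta\gamma\delta\epsilon\ldots) = b_\beta(cde\ldots \mid \gamma\delta\epsilon\ldots) - (r-1)\,c_\beta(bde\ldots \mid \gamma\delta\epsilon\ldots),$$

i.e.

$$M_r = s_1M_{r-1} - (r-1)\,c_\beta(bde\ldots \mid \gamma\delta\epsilon\ldots).$$

Then, by applying the reduction formula (6) to $(bde\ldots \mid \gamma\delta\epsilon\ldots)$,

$$\begin{aligned}
M_r &= s_1M_{r-1} - (r-1)\,c_\beta\bigl[\,b_\gamma(de\ldots \mid \delta\epsilon\ldots) - (r-2)\,d_\gamma(be\ldots \mid \delta\epsilon\ldots)\bigr] \\
    &= s_1M_{r-1} - (r-1)\,s_2M_{r-2} + (r-1)(r-2)\,c_\beta d_\gamma(be\ldots \mid \delta\epsilon\ldots).
\end{aligned}$$

And so, by repeated application of the reduction formula (6), and final
division by $(r-1)!$, the equation (4), or the equivalent equation (5), is
obtained.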
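Equation (5) lends itself to a direct numerical check. The sketch below is an illustration added to the text, not the authors' method: it assumes four integer latent roots chosen arbitrarily, computes the $e_r$ and $s_r$ from their definitions, and confirms $r\,e_r = e_{r-1}s_1 - e_{r-2}s_2 + \ldots + (-)^{r-1}s_r$, including for $r > m$ where $e_r = 0$. The function names are hypothetical.

```python
from itertools import combinations
from math import prod


def power_sum(roots, k):
    """s_k: sum of the k-th powers of the latent roots."""
    return sum(w ** k for w in roots)


def elementary(roots, k):
    """e_k: k-th elementary symmetric function (e_0 = 1, e_k = 0 for k > m)."""
    return sum(prod(c) for c in combinations(roots, k))


def newton_rhs(roots, r):
    """e_{r-1} s_1 - e_{r-2} s_2 + ... + (-1)^(r-1) s_r."""
    return sum(
        (-1) ** (k - 1) * elementary(roots, r - k) * power_sum(roots, k)
        for k in range(1, r + 1)
    )


roots = [1, -2, 4, 7]  # illustrative latent roots, so m = 4
for r in range(1, 7):
    assert r * elementary(roots, r) == newton_rhs(roots, r)
```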

The equations (5) for $r = 1, 2, \ldots, m$ may be solved to give $s_1, s_2, \ldots, s_m$
as polynomials in $e_1, e_2, \ldots, e_m$. Hence any homogeneous symmetric
function of degree $n$ in the latent roots of $A$ is expressible as a polynomial
in the elementary symmetric functions $e_1, e_2, \ldots, e_m$ or, in other words, as
a linear combination of the double standard forms

| S$}, 

where and Sfy are tableaux of the same shape (A) (with not more 
than m rows) formed from English and Greek letters, respectively, in 
which the letters appear in alphabetical order, reading along each row 
in turn from left to right. As an illustration of the last statement, the 
product e x e* % e % may be expressed symbolically as 


abed 

a fl y 8 

e f k 

t <f> K 

( 

A 


Any polynomial identity existing between «? lf e t . e m immediately 

implies a linear relation between the functions say 

2 ww“°- ( 7 ) 

<M 

Double standard forms constructed from sets of independent variables are 
known to be linearly independent, and this leads one to guess that the c (x) 
in the last equation are all zero. A little care, however, is always necessary 
in passing in this way from independent variables to equivalent symbols, as 
the principle of interchange of pairs of symbols often introduces linear 
relations between quantities which are otherwise independent. The 
vanishing of the r (A) may be proved as follows. By the principle of inter¬ 
change of equivalent symbols, each * (A) in (7) may be replaced by ^ (x) , 
which is defined as (1 fnl) times the sum of all the double forms obtained 
from {S$ | S$} by permuting the pairs of symbols (a , a), (b, /?), . . . in 
all possible ways. Equation (7) now becomes 

( 8 ) 

and here the symbols a, / 3 , . . ., a, b, . . . may be treated as independent 
variables, the symmetry of the with respect to the different pairs of 
symbols replacing the principle of interchange of pairs of symbols. The 
vanishing of the c (A) now follows by induction. In fact, a tableau of shape 
(A) is said to precede a tableau of shape (/x) if the first row of the former 
which is not as long as the corresponding row of the latter is shorter than it. 
The hypothesis of the induction is that ^=0 for all shapes (ji) following 
the shape (A). But c M is zero if (ji ) is the latest possible shape, namely 
the one-rowed tableau, as may be seen by putting a=b = c= . . . and 
a=/J=y= . . .; for this causes all the functions to vanish except that 
corresponding to' the single-rowed tableau. Then if any one of the 





double forms in the expression is selected, and all the symbols 
in each row separately of its tableaux are made equal to one another


so that, for example 


fabc a j 3 y\ 
1 \de Sc j 


becomes 


a a a a a a I 


vanishing double form will be obtained; but all the for shapes (ji) 
earlier than (A) will be made to vanish by this coalescence of symbols, 
while the other forms appearing in <^will either vanish or become equal 
in value to the selected form. And so equation (8) implies that ^ (A) =o, 
which completes the induction. 

Hence the functions $e_1, e_2, \ldots, e_m$ form an irreducible rational integral
basis for symmetric functions of the latent roots of $A$. And on account of
the mutual expressibility (using equations (5) for $r = 1, 2, \ldots, m$) of the
set $e_1, e_2, \ldots, e_m$ and $s_1, s_2, \ldots, s_m$, the latter set is also an irreducible
polynomial basis for symmetric functions.

The result just proved is, of course, a well-known one, but the method
of proof is worth noting, for it illustrates a general method of reducing
problems in equivalent symbols to problems in the manipulation of sets of
independent variables. The result itself illustrates the second fundamental
theorem of invariants, which in this case states that, apart from the identical
vanishing of the $M_r$ for $r > m$, any identity in symmetric functions depends
entirely on algebraic manipulation of the inner products $a_\alpha, a_\beta, \ldots,$
$b_\alpha, b_\beta, \ldots$ And so, on the basis of this theorem, the vanishing of the
$c_{(\lambda)}$ in (7) follows at once. This method of proving the vanishing of the
$c_{(\lambda)}$ as a consequence of the second fundamental theorem of invariants is,
however, logically unsatisfactory, as the proof of this theorem is a matter
of some difficulty.

The relations between the $e_i$ and the $s_i$ given by equation (5) are nothing
else than Newton's relations between sums of powers and elementary
symmetric functions; thus these relations, and also the classical fundamental
theorem of symmetric functions, which states that the elementary
symmetric functions form an irreducible polynomial basis for symmetric
functions, have been derived by symbolic methods, using symbolic
determinant reduction and the first fundamental theorem of projective
invariants. There is thus a close link between the symbolic theory of
determinants and that of the elementary symmetric functions $e_i$ together
with the $s_i$. This suggests that a corresponding link may subsist between
the symbolic theory and that of the third well-known set of basic symmetric
functions, namely $h_r \equiv N_r/r!$. The $h_r$ have been shown to be representable
as symbolic permanents; so the reduction of such a permanent will now be
considered.

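The working follows exactly the lines of the derivation of the Newton
relations, except that changes of sign are not involved when rows or
columns of a permanent are interchanged. Starting with the symbolic
permanent

$$(acde\ldots \,\|\, \beta\gamma\delta\epsilon\ldots) =
\overset{+}{\Bigg|}\begin{matrix}
a_\beta & a_\gamma & a_\delta & \ldots \\
c_\beta & c_\gamma & c_\delta & \ldots \\
d_\beta & d_\gamma & d_\delta & \ldots \\
\vdots & \vdots & \vdots &
\end{matrix}\overset{+}{\Bigg|}$$

of $r$ rows and $r$ columns, a reduction formula corresponding to (6) is
obtained, namely

$$(acde\ldots \,\|\, \beta\gamma\delta\epsilon\ldots) = a_\beta(cde\ldots \,\|\, \gamma\delta\epsilon\ldots) + (r-1)\,c_\beta(ade\ldots \,\|\, \gamma\delta\epsilon\ldots).$$

Then by repeated application of this reduction formula,

$$\begin{aligned}
N_r &= (bcde\ldots \,\|\, \beta\gamma\delta\epsilon\ldots) \\
    &= b_\beta(cde\ldots \,\|\, \gamma\delta\epsilon\ldots) + (r-1)\,c_\beta(bde\ldots \,\|\, \gamma\delta\epsilon\ldots) \\
    &= s_1N_{r-1} + (r-1)\,c_\beta(bde\ldots \,\|\, \gamma\delta\epsilon\ldots) \\
    &= s_1N_{r-1} + (r-1)\,s_2N_{r-2} + (r-1)(r-2)\,c_\beta d_\gamma(be\ldots \,\|\, \delta\epsilon\ldots) \\
    &= s_1N_{r-1} + (r-1)\,s_2N_{r-2} + (r-1)(r-2)\,s_3N_{r-3} + \ldots + (r-1)!\,s_r.
\end{aligned}$$

Hence

$$\frac{N_r}{(r-1)!} = \frac{N_{r-1}}{(r-1)!}\,s_1 + \frac{N_{r-2}}{(r-2)!}\,s_2 + \ldots + s_r,$$

i.e.

$$r\,h_r = h_{r-1}s_1 + h_{r-2}s_2 + \ldots + s_r.$$

And these are Brioschi's relations between the sums of powers and the
complete homogeneous functions $h_r$.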
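Brioschi's relations can be confirmed in the same numerical way as Newton's. The Python sketch below is an added illustration with hypothetical function names and arbitrarily chosen integer latent roots; it checks $r\,h_r = h_{r-1}s_1 + h_{r-2}s_2 + \ldots + s_r$, in which, unlike the Newton case, every term carries a positive sign.

```python
from itertools import combinations_with_replacement
from math import prod


def power_sum(roots, k):
    """s_k: sum of the k-th powers of the latent roots."""
    return sum(w ** k for w in roots)


def complete(roots, k):
    """h_k: complete homogeneous symmetric function of degree k (h_0 = 1)."""
    return sum(prod(c) for c in combinations_with_replacement(roots, k))


def brioschi_rhs(roots, r):
    """h_{r-1} s_1 + h_{r-2} s_2 + ... + s_r, all signs positive."""
    return sum(complete(roots, r - k) * power_sum(roots, k) for k in range(1, r + 1))


roots = [1, 2, -3]  # illustrative latent roots
for r in range(1, 7):
    assert r * complete(roots, r) == brioschi_rhs(roots, r)
```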

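Newton's and Brioschi's relations may be solved, giving expressions
for the $s_i$ in terms of the $e_i$ and $h_i$, respectively, or in terms of the $M_i$ and
$N_i$, respectively. The latter expressions are

$$s_1 = M_1 = N_1,$$

$$s_2 = M_1^2 - M_2 = -N_1^2 + N_2,$$

$$s_3 = M_1^3 - \tfrac{3}{2}M_1M_2 + \frac{M_3}{2!} = N_1^3 - \tfrac{3}{2}N_1N_2 + \frac{N_3}{2!}, \quad \ldots$$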


It may be verified by induction on $i$, after differentiating the $i$th
Newtonian or Brioschian relation with regard to the first $i$ of the
$M_i$ and $N_i$, respectively, and adding the results, that

$$\sum_{j=1} j\,M_{j-1}\frac{\partial s_i}{\partial M_j} = 0, \qquad
\sum_{j=1} j\,N_{j-1}\frac{\partial s_i}{\partial N_j} = 0 \qquad (i > 1). \qquad (9)$$

Then, since the $M_i$ appear as coefficients in the characteristic function
of $A$, namely

$$f(\lambda) = |\lambda I - A| = \lambda^m - M_1\lambda^{m-1} + \frac{M_2}{2!}\lambda^{m-2} - \ldots, \qquad (10)$$

and since also

$$\frac{1}{f(\lambda)} = \frac{1}{\lambda^m} + \frac{N_1}{\lambda^{m+1}} + \frac{N_2}{2!\,\lambda^{m+2}} + \ldots,$$

it follows that the $s_i$ are seminvariants (of equal degree and weight)
belonging to the binary $m$-ic $(1, M_1, M_2, \ldots, M_m \between X_1, X_2)^m$ and the binary
perpetuant $(1, N_1, N_2, \ldots \between X_1, X_2)^\infty$ separately.

Incidentally, it is worth remarking that the differential equations (9)
may be regarded as the annihilation of the $s_i$ by each of the Sylvester
operators

$$\Omega = \sum j\,M_{j-1}\frac{\partial}{\partial M_j}, \qquad \Omega' = \sum j\,N_{j-1}\frac{\partial}{\partial N_j},$$

or equally well by the Hammond operators

$$d_1 = \sum e_{j-1}\frac{\partial}{\partial e_j}, \qquad D_1 = \sum h_{j-1}\frac{\partial}{\partial h_j}$$

(Hammond, 1882; also cf. MacMahon, 1915, I, 27). The polynomial
expressions for the $s_i$ in terms of the $e_i$ and $h_i$ are called protomorphic
seminvariants (cf. Elliott, 1913, p. 212). Alternative solutions of the
differential equations (9) exist. For instance, instead of $s_4$ may be taken

$$8e_4 - 8e_1e_3 + 4e_1^2e_2 - e_1^4 = s_2^2 - 2s_4,$$

with a similar expression in the $h_i$, giving $s_2^2 + 2s_4$. This illustrates the
fact that complete systems of protomorphs, such as the $s_i$, are not necessarily
unique.

One further result of the classical theory of symmetric functions
remains to be discussed, namely the Wronskian relations between the $e_i$
and the $h_i$ which arise by multiplying together the series expansions of
$f(\lambda)$ and $1/f(\lambda)$ given above and equating the coefficients of powers of
$\lambda$ to zero. These relations, written in terms of the $M_i$ and the $N_i$, are

$$M_1 - N_1 = 0,$$

$$M_2 - 2M_1N_1 + N_2 = 0,$$

$$M_3 - 3M_2N_1 + 3M_1N_2 - N_3 = 0,$$

and in general, say,

$$(M - N)^r \equiv M_r - rM_{r-1}N_1 + \frac{r(r-1)}{1\cdot 2}M_{r-2}N_2 - \ldots + (-)^rN_r = 0,$$

with binomial coefficients (the corresponding relations between the $e_i$ and
the $h_i$ have unit coefficients). Writing these symbolically,

$$(a \mid \alpha) - (a \,\|\, \alpha) = 0,$$

$$(ab \mid \alpha\beta) - 2(a \mid \alpha)(b \,\|\, \beta) + (ab \,\|\, \alpha\beta) = 0,$$

$$(abc \mid \alpha\beta\gamma) - 3(ab \mid \alpha\beta)(c \,\|\, \gamma) + 3(a \mid \alpha)(bc \,\|\, \beta\gamma) - (abc \,\|\, \alpha\beta\gamma) = 0, \quad \text{etc.}$$

But these are the fundamental identities between determinants and
permanents discovered by Muir (Muir, 1897). In fact, we may write
any one of them, say the third, in the more usual form,

$$|a_\alpha b_\beta c_\gamma| - \sum |a_\alpha b_\beta|\;\overset{+}{|}c_\gamma\overset{+}{|} + \sum |a_\alpha|\;\overset{+}{|}b_\beta c_\gamma\overset{+}{|} - \overset{+}{|}a_\alpha b_\beta c_\gamma\overset{+}{|} = 0,$$

where the summation is taken over all the different combinations of the
pairs of symbols $(a, \alpha)$, $(b, \beta)$, $(c, \gamma)$. The parallelism between Muir's
results and the relations of Wronski has been clearly brought out by
MacMahon (MacMahon, 1924; also MacMahon, 1927, p. 279; Littlewood,
1940, pp. 119-120), but the symbolic method shows it to be inevitable.
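The binomial form of the Wronski relations, $\sum_k (-)^k \binom{r}{k} M_{r-k}N_k = 0$, can be tested numerically. The sketch below is an added illustration, not part of the paper: it assumes the evaluations $M_r = r!\,e_r$ and $N_r = r!\,h_r$ obtained earlier, uses arbitrarily chosen integer latent roots, and employs hypothetical function names.

```python
from itertools import combinations, combinations_with_replacement
from math import comb, factorial, prod


def M(roots, r):
    """Bideterminant value, taken here as r! times e_r."""
    return factorial(r) * sum(prod(c) for c in combinations(roots, r))


def N(roots, r):
    """Bipermanent value, taken here as r! times h_r."""
    return factorial(r) * sum(prod(c) for c in combinations_with_replacement(roots, r))


def wronski_muir(roots, r):
    """M_r - r M_{r-1} N_1 + C(r,2) M_{r-2} N_2 - ... + (-1)^r N_r."""
    return sum(
        (-1) ** k * comb(r, k) * M(roots, r - k) * N(roots, k)
        for k in range(r + 1)
    )


roots = [2, 3, 5]  # illustrative latent roots
for r in range(1, 6):
    assert wronski_muir(roots, r) == 0
```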


7. Symbolic Expression of $\lambda$-bialternants

The symbolic double standard form {S$ | S$} has already been 
mentioned (§ 6). The polarized double standard form { 5 $ | 5 $} is 
defined as the sum of all the double forms obtained by permuting in all 
possible ways the symbols within each row of S$ independently and 
adding up the results. Since {S$J 5 $} is obtained from . . . 

by operating on the sequence a, b, c, . . . with a certain substitutional 
operator, namely the Young operator E of the tableau S$ {E is one of the 
E ^ of Rutherford (1948, p. 16)), this symbolic polarized double form is a 
symmetric function of the latent roots of A. 

The symmetric functions { 5 $ | 5 ^} are of importance in the theory 
of invariant matrices, being, in fact, equal to numerical multiples of the 
traces of the irreducible invariant matrices of A. It may be quite simply 
shown from considerations of the structure of these invariant matrices that 

(iz) 






where is the A-bialtcrnant 


^+1 

... k^ x k ^ A Ai+1 

■ • • h ^-\ K x 

in the latent roots of A ; 0 ^ is a positive integer which may be defined by 
the equation E x = ti^E satisfied by the Young operator (Rutherford, 1948, 
P- 19 ). 

Equation (12) may also be obtained in the following way, as was shown
by Dr D. E. Rutherford in 1946.*

In the equation 

| *$(£)} . . . 

on account of the interchangeability property of equivalent symbols, 
E may be replaced by 0E0 - 1 , where o is any permutation belonging to the 
symmetric group on the n letters a, b t c 9 . . . But oEa~ x is the Young 
operator corresponding to a tableau of shape (A) in the a t b t c 9 . . . obtained 
by performing the permutation a on the symbols of S$- And so 




(the summation being over all elements of the symmetric group on n letters) 


(Rutherford, 1948, p. 65). Here T w is a substitutional operator, 
introduced by Young, and defined as 

y<*) 

ZtS‘, 

ini 

where the n\ substitutional operators are basal units for any set of 
irreducible representations of the symmetric group and /* A> is the number 
of standard tableaux of shape (A) in n distinct symbols. From the theory 

* Turnbull communicated this formula, which he had verified for simple cases by 
elementary methods, to the Edinburgh Mathematical Society in the session 1945-46, and 
at a seminar at St Andrews, when he asked for a formal proof. The above proof was 
supplied next day in a letter by Dr D. E. Rutherford. 





of group characters (Rutherford, 1948, p. 66), 

(fl 1 ) 

where $\chi^{(\lambda)}_{(\rho)}$ is the character of a permutation of class $(\rho)$ in the representation
corresponding to partition $(\lambda)$. And so, since $f^{(\lambda)}\theta^{(\lambda)} = n!$ (Rutherford,
1948, p. 65),


xfcfoA.) (§ 4, equation (3)). 

But since, from the Frobenius theory of group characters, 

n 1 ^ 00 " 2 x8^w*^(p) 

GO 

(Littlewood, 1940, p. 86), equation (12) follows. The result is implicit also 
in Theorem II of Young's Fourth Memoir (Young, 1929, p. 259). 

A simple example of (12) involving three pairs of equivalent symbols is 



; '}♦{: * 



A q h x 


The expanded form on the left is which is i(j}-J»), 

which again is equal to the 4 -bialternant on the right, and indeed also 
to V1-V1 


REFERENCES TO LITERATURE 

Aronhold, 1858. “Theorie der homogenen Funktionen dritten Grades von
drei Veränderlichen”, Journ. für Math., lv, 97-191.

Clebsch, 1861 a. “Über eine Transformation der homogenen Funktionen
dritter Ordnung mit vier Veränderlichen”, Journ. für Math., lviii, 109-126.

-, 1861 b. “Über symbolische Darstellung algebraischer Formen”, Journ.
für Math., lix, 1-62.

Elliott, 1913. Algebra of Quantics, Oxford.

Hammond, 1882. “On the Calculating of Symmetric Functions”, Proc. London
Math. Soc., ser. 1, xiii, 79.

Littlewood, 1940. The Theory of Group Characters, Oxford.

MacMahon, 1915. Combinatory Analysis, Cambridge.

-, 1924. “Researches in the Theory of Determinants”, Trans. Cambridge
Phil. Soc., xxiii, 89-135.

-, 1927. “The Structure of a Determinant”, Journ. London Math.
Soc., ii, 273-286.

Muir, 1897. “A Relation between Permanents and Determinants”, Proc. Roy.
Soc. Edin., xxii, 134-136.

Rutherford, 1948. Substitutional Analysis, Edinburgh.

Sylvester, 1851. “On the Relation between the Minor Determinants of Linearly
Equivalent Quadratic Functions”, Phil. Mag., ser. 4, i, 295-305, 415; Coll.
Math. Papers, i, 341-350, 251.

-, 1852. “On the Principles of the Calculus of Forms”, Cambridge and Dublin
Math. Journ., vii, 52-97; Coll. Math. Papers, i, 284-327.

Turnbull, 1932. “The Invariant Theory of a General Bilinear Form”, Proc.
London Math. Soc., ser. 2, xxxiii, 1-19.

-, 1945. The Theory of Determinants, Matrices and Invariants (2nd Ed.),
London and Glasgow.

Young, 1929. “On Quantitative Substitutional Analysis” (Fourth Memoir),
Proc. London Math. Soc., ser. 2, xxxi, 253-288.


(Issued separately May 14, 1951) 





XIII.—Studies in Practical Mathematics. VI. On the Factorization of
Polynomials by Iterative Methods.* By A. C. Aitken, D.Sc., F.R.S.,
Mathematical Institute, University of Edinburgh.

(MS. received March 27, 1950. Revised MS. received July 31, 1950. Read June 5, 1950)


Synopsis 

The method of iteration of penultimate remainders, introduced by S. N. Lin for 
approximating by stages to the exact factors of a polynomial, is subjected to theoretical 
analysis. The matrix governing the iterative process is obtained, and its latent roots and 
latent vectors are found. Incidental theorems yielding further factorizations are proved, 
and processes are developed for accelerating convergence. Numerical examples illustrate 
varying situations likely to arise in practice. 


1. Preliminary

The purpose of this paper is to supply the necessary theory and to extend
the scope of an iterative method (Lin, 1941) for approximating by stages 
to an exact factor of a polynomial. Other writers (Fry, 1945; Friedman, 
1949) have discussed the method. Friedman has introduced a procedure 
which he describes as a modification of Lin's; but in fact, though both 
methods make use of iterated polynomial division, Friedman's method is 
his own, and rests on a different basis from that of Lin. 

We examine the process ab initio, and extend it to divisors of arbitrary
degree. We regard it from the point of view of repeated linear
transformation of a vector, namely the vector of small errors or deviations from
the final exact coefficients. The linear operator that emerges is a matrix
$R$ of special type; expressions are obtained for its latent roots and its latent
vectors. Not only the dominant latent root, on which the convergence of
the process depends, but also the corresponding latent vector reveals itself
in the course of the arithmetic and provides (§ 7) valuable additional
information. It will appear also that certain processes for accelerating
convergence, familiar already in other applications of vector iteration, are
again available here; and we introduce a new and simplified version of
one of these, adapted to the case where the dominant latent roots of $R$ are
a conjugate complex pair.

* This paper was assisted in publication by a grant from the Carnegie Trust for the
Universities of Scotland.


2. The Penultimate Remainder Polynomial 
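Let the polynomial $f_n(x)$ proposed for factorization be

$$f_n(x) = x^n - a_1x^{n-1} + a_2x^{n-2} - \ldots + (-)^na_n, \qquad a_n \neq 0, \qquad (1)$$

$$\phantom{f_n(x)} = (x-\alpha_1)(x-\alpha_2)\ldots(x-\alpha_n), \qquad \alpha_1\alpha_2\ldots\alpha_n \neq 0, \qquad (2)$$

where the divisor polynomial (we shall later specify which particular
divisor) is

$$d_m(x) = x^m - b_1x^{m-1} + b_2x^{m-2} - \ldots + (-)^mb_m \qquad (3)$$

$$\phantom{d_m(x)} = (x-\beta_1)(x-\beta_2)\ldots(x-\beta_m), \qquad (4)$$

and the corresponding quotient polynomial for this divisor is

$$q_{n-m}(x) = x^{n-m} - c_1x^{n-m-1} + c_2x^{n-m-2} - \ldots + (-)^{n-m}c_{n-m} \qquad (5)$$

$$\phantom{q_{n-m}(x)} = (x-\gamma_1)(x-\gamma_2)\ldots(x-\gamma_{n-m}). \qquad (6)$$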

The above is a resolution into exact polynomial factors. In practice,
however, the factors of $f_n(x)$ are not known, and the problem is to obtain
them. For this purpose we begin by taking a trial divisor of degree $m$,
let us say

$$t_m(x) = x^m - b_1'x^{m-1} + b_2'x^{m-2} - \ldots + (-)^mb_m', \qquad (7)$$

presumed to be an approximation to some exact divisor $d_m(x)$. Using
the routine of polynomial division, in a form adapted for machine (§ 5) if
one is available, we divide $f_n(x)$ by $t_m(x)$. The division is stopped not, as
usual, at the final remainder, but at the stage immediately preceding,
where the current remainder is of the same degree, $m$, as the divisor.
This particular remainder will be called the penultimate remainder (p.r.)
of $f_n(x)$ with respect to $t_m(x)$. When divided by its leading coefficient, so
that the highest term becomes $x^m$, the p.r. yields the reduced penultimate
remainder (r.p.r.).
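The definition of the penultimate remainder can be made concrete with a short Python sketch. This is an added illustration rather than Aitken's own routine (his machine layout is the subject of § 5): it assumes polynomials are stored as ordinary signed coefficient lists, highest power first, rather than in the alternating-sign convention of (1) to (7), and the function names are hypothetical.

```python
def penultimate_remainder(f, t):
    """Divide f by the monic trial divisor t, stopping one step early.

    f and t are coefficient lists, highest power first. The result has
    len(t) entries: a remainder of the same degree m as the divisor.
    """
    n, m = len(f) - 1, len(t) - 1
    work = list(f)
    # Ordinary long division would take n - m + 1 steps; we take n - m.
    for k in range(n - m):
        factor = work[k]
        for j in range(m + 1):
            work[k + j] -= factor * t[j]
    return work[n - m:]


def reduced(p):
    """Divide through by the leading coefficient, assumed non-zero."""
    return [c / p[0] for c in p]


f = [1, -11, 10]   # (x - 1)(x - 10)
t = [1, -0.5]      # trial divisor x - 0.5
pr = penultimate_remainder(f, t)   # p.r. is -10.5 x + 10
rpr = reduced(pr)                  # r.p.r. is x - 10/10.5
```

In this small case the r.p.r. is already closer to the exact factor $x - 1$ than the trial divisor was.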

The method now to be investigated is that of iteration of the operation
of penultimate remaindering, each successive r.p.r. being taken as a fresh
divisor. It will appear that under certain conditions the sequence of r.p.r.
polynomials converges to an exact factor $d_m(x)$ of $f_n(x)$. There are in
general $\binom{n}{m}$ such exact factors of degree $m$, real or complex; each is
characterized (§ 8) by stability or instability, in the sense that a slightly
displaced divisor yields an r.p.r. in the one case nearer to $d_m(x)$, in the other
case further away from it. The criteria of stability (§ 4) depend both on
the divisor $d_m(x)$ and on the quotient $q_{n-m}(x)$.
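The iteration just described may be sketched as follows. The example is an added illustration under stated assumptions: the quartic and the starting divisor are chosen here for convenience (the exact factor sought has the two zeros of smallest modulus), coefficients are ordinary signed coefficients listed from the highest power down, and the function names are hypothetical. It is not a general-purpose factorization routine, and convergence is not guaranteed for other factors or starting values.

```python
def rpr_step(f, t):
    """One penultimate-remainder step: return the reduced p.r. of f by t."""
    n, m = len(f) - 1, len(t) - 1
    work = list(f)
    for k in range(n - m):            # stop one step before the final remainder
        factor = work[k]
        for j in range(m + 1):
            work[k + j] -= factor * t[j]
    p = work[n - m:]
    return [c / p[0] for c in p]      # leading coefficient assumed non-zero


def iterate_rpr(f, t, steps):
    """Take each r.p.r. as the next divisor; return every divisor visited."""
    history = [list(t)]
    for _ in range(steps):
        t = rpr_step(f, t)
        history.append(t)
    return history


# f(x) = (x^2 - 3x + 2)(x - 10)(x - 20), an illustrative quartic.
f = [1, -33, 292, -660, 400]
history = iterate_rpr(f, [1, -2.8, 1.9], 60)
final = history[-1]                   # approaches x^2 - 3x + 2
```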

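The coefficients both in quotient and in p.r. are determinants of a
special bigradient, almost persymmetric recurrent, type not unfamiliar
in the literature. When $f_n(x)$ is divided by $t_m(x)$ the quotient is as follows:

$$x^{n-m} + \begin{vmatrix} 1 & a_1 \\ 1 & b_1' \end{vmatrix} x^{n-m-1}
+ \begin{vmatrix} 1 & a_1 & a_2 \\ 1 & b_1' & b_2' \\ \cdot & 1 & b_1' \end{vmatrix} x^{n-m-2} + \ldots, \qquad (8)$$

while the form of the p.r. may be sufficiently indicated by the first few
terms in the case $n - m = 3$, namely

$$\begin{vmatrix} 1 & a_1 & a_2 & a_3 \\ 1 & b_1' & b_2' & b_3' \\ \cdot & 1 & b_1' & b_2' \\ \cdot & \cdot & 1 & b_1' \end{vmatrix} x^m
- \begin{vmatrix} 1 & a_1 & a_2 & a_4 \\ 1 & b_1' & b_2' & b_4' \\ \cdot & 1 & b_1' & b_3' \\ \cdot & \cdot & 1 & b_2' \end{vmatrix} x^{m-1}
+ \begin{vmatrix} 1 & a_1 & a_2 & a_5 \\ 1 & b_1' & b_2' & b_5' \\ \cdot & 1 & b_1' & b_4' \\ \cdot & \cdot & 1 & b_3' \end{vmatrix} x^{m-2} - \ldots \qquad (9)$$

Dividing the p.r. by its leading coefficient, presumed not to vanish, we
have the r.p.r.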


3. The Linear Operation on the Vector of Errors


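Let $b_i' = b_i + \epsilon_i$, and let it be assumed that, relative to the moduli of the
coefficients $b_i$, the moduli of the errors $\epsilon_i$ are so small that powers
and products above the first degree may be neglected. We proceed to
study, under this assumption, the transformation of errors produced by the
p.r. operation. It will appear that this transformation is represented by a
matrix $R$ of $m$ rows and $m$ columns. The elements of $R$ are determinants
of varying order. To write $R$ out in extenso for general $n$ and $m$ would
require a great deal of space; but the fundamental results and their proof
can be sufficiently illustrated by the case $n = 7$, $m = 4$, $n - m = 3$.

Let us first examine, in this case, the coefficient of $x^{m-1}$ in the r.p.r.,
that is to say, the coefficient corresponding to $b_1$ in $d_m(x)$. It is

$$\begin{vmatrix}
1 & a_1 & a_2 & a_4 \\
1 & b_1+\epsilon_1 & b_2+\epsilon_2 & b_4+\epsilon_4 \\
\cdot & 1 & b_1+\epsilon_1 & b_3+\epsilon_3 \\
\cdot & \cdot & 1 & b_2+\epsilon_2
\end{vmatrix}
\div
\begin{vmatrix}
1 & a_1 & a_2 & a_3 \\
1 & b_1+\epsilon_1 & b_2+\epsilon_2 & b_3+\epsilon_3 \\
\cdot & 1 & b_1+\epsilon_1 & b_2+\epsilon_2 \\
\cdot & \cdot & 1 & b_1+\epsilon_1
\end{vmatrix}. \qquad (1)$$

The deviation $\epsilon_1' = b_1'' - b_1$ is readily found to be

$$\epsilon_1' = -\begin{vmatrix}
1 & a_1 & a_2 & a_3 & a_4 \\
1 & b_1+\epsilon_1 & b_2+\epsilon_2 & b_3+\epsilon_3 & b_4+\epsilon_4 \\
\cdot & 1 & b_1+\epsilon_1 & b_2+\epsilon_2 & b_3+\epsilon_3 \\
\cdot & \cdot & 1 & b_1+\epsilon_1 & b_2+\epsilon_2 \\
\cdot & \cdot & \cdot & 1 & b_1
\end{vmatrix}
\div
\begin{vmatrix}
1 & a_1 & a_2 & a_3 \\
1 & b_1+\epsilon_1 & b_2+\epsilon_2 & b_3+\epsilon_3 \\
\cdot & 1 & b_1+\epsilon_1 & b_2+\epsilon_2 \\
\cdot & \cdot & 1 & b_1+\epsilon_1
\end{vmatrix}. \qquad (2)$$

But, comparing coefficients of powers of $x$ in § 2 (1) and the product of § 2 (3)
and § 2 (5), we have

$$a_r = b_r + b_{r-1}c_1 + b_{r-2}c_2 + \ldots + b_1c_{r-1} + c_r. \qquad (3)$$

Hence, performing the operation

$$\text{row}_1 - \text{row}_2 - c_1\,\text{row}_3 - c_2\,\text{row}_4 - c_3\,\text{row}_5 \qquad (4)$$

on the numerator in (2), with a similar operation on the denominator, and
retaining terms of the first order only in the $\epsilon_i$, we have on expansion

$$\epsilon_1' = c_3^{-1}\left\{
\begin{vmatrix} 1 & c_1 & c_2 & \cdot \\ 1 & b_1 & b_2 & b_3 \\ \cdot & 1 & b_1 & b_2 \\ \cdot & \cdot & 1 & b_1 \end{vmatrix}\epsilon_1
- \begin{vmatrix} 1 & c_1 & c_2 \\ 1 & b_1 & b_2 \\ \cdot & 1 & b_1 \end{vmatrix}\epsilon_2
+ \begin{vmatrix} 1 & c_1 \\ 1 & b_1 \end{vmatrix}\epsilon_3
- \epsilon_4 \right\}. \qquad (5)$$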


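An analogous result holds for general values of $n$ and $m$, and can be
proved in a similar manner. A similar procedure can be applied to the
determinant-quotients in the coefficients of the r.p.r. corresponding to
$b_2$, $b_3$ and $b_4$. The results can be summed up in the matrix operation on
the vector of errors $\epsilon$, namely

$$\epsilon' = R\,\epsilon, \qquad (6)$$

where $R$ in our illustration is the matrix

$$R = c_3^{-1}\begin{bmatrix}
\begin{vmatrix} 1 & c_1 & c_2 & \cdot \\ 1 & b_1 & b_2 & b_3 \\ \cdot & 1 & b_1 & b_2 \\ \cdot & \cdot & 1 & b_1 \end{vmatrix} &
-\begin{vmatrix} 1 & c_1 & c_2 \\ 1 & b_1 & b_2 \\ \cdot & 1 & b_1 \end{vmatrix} &
\begin{vmatrix} 1 & c_1 \\ 1 & b_1 \end{vmatrix} &
-1 \\[18pt]
\begin{vmatrix} 1 & c_1 & c_2 & \cdot \\ 1 & b_1 & b_2 & b_4 \\ \cdot & 1 & b_1 & b_3 \\ \cdot & \cdot & 1 & b_2 \end{vmatrix} &
-\begin{vmatrix} 1 & c_1 & \cdot \\ 1 & b_1 & b_3 \\ \cdot & 1 & b_2 \end{vmatrix} &
\begin{vmatrix} 1 & c_2 \\ 1 & b_2 \end{vmatrix} &
-c_1 \\[18pt]
\begin{vmatrix} 1 & c_1 & c_2 & \cdot \\ 1 & b_1 & b_2 & \cdot \\ \cdot & 1 & b_1 & b_4 \\ \cdot & \cdot & 1 & b_3 \end{vmatrix} &
-\begin{vmatrix} 1 & c_1 & \cdot \\ 1 & b_1 & b_4 \\ \cdot & 1 & b_3 \end{vmatrix} &
\begin{vmatrix} 1 & \cdot \\ 1 & b_3 \end{vmatrix} &
-c_2 \\[18pt]
\begin{vmatrix} 1 & c_1 & c_2 & \cdot \\ 1 & b_1 & b_2 & \cdot \\ \cdot & 1 & b_1 & \cdot \\ \cdot & \cdot & 1 & b_4 \end{vmatrix} &
-\begin{vmatrix} 1 & c_1 & \cdot \\ 1 & b_1 & \cdot \\ \cdot & 1 & b_4 \end{vmatrix} &
\begin{vmatrix} 1 & \cdot \\ 1 & b_4 \end{vmatrix} &
\cdot
\end{bmatrix}. \qquad (7)$$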



In the general case $R$ is a matrix of order $m \times m$, and is derived by
steps quite analogous to those described above. The general form of $R$
can be inferred from inspection of (7). It is chiefly characterized by its
leading element, the later elements in the first row being principal minors
of this leading element. At each descent to a lower row the elements in
the last column of the determinants concerned receive a unit increase of
suffix. A feature worth noticing is that $c_{n-m}$, represented in the example
above by $c_3$, though present in the extraneous scalar factor $(-)^{n-m-1}c_{n-m}^{-1}$,
is missing from the elements of the determinants within $R$.

Each successive r.p.r. is taken as the divisor for a new r.p.r.; and so
the vector of errors, or more precisely the vector of the linear components
of the errors, is repeatedly transformed by $R$. Hence the condition of
convergence, under the stated assumptions, is that all the latent roots of
$R$ should be such that $|\rho_\nu| < 1$. We must next obtain these latent roots
and the associated latent vectors.


4. Latent Roots and Vectors of the Matrix of Iteration 

The zeros $\beta_i$ of $d_m(x)$ together with the zeros $\gamma_j$ of $q_{n-m}(x)$ constitute all
the $n$ zeros of $f_n(x)$. It will be shown that in the general case the $m$ latent
vectors of $R$ are

$$\Bigl\{\,1,\ \sum\beta_i,\ \sum\beta_i\beta_j,\ \sum\beta_i\beta_j\beta_k,\ \ldots\Bigr\}_{[\nu]}, \qquad (1)$$

where the bracketed subscript $[\nu]$ denotes that in the summations (each of
which yields an elementary symmetric function of a particular set of $m - 1$
roots $\beta$) one root $\beta_\nu$ is omitted in each latent vector; or is replaced by 0,
which has the same effect. Thus in the respective vectors we omit in turn
$\beta_1, \beta_2, \ldots, \beta_m$.

As for the latent roots $\rho_\nu$, they are of a rather special nature, most
simply described as coefficients in reduced penultimate quotients (r.p.q.)
with respect to linear divisors $x - \beta_\nu$. In fact, let the penultimate quotient
when $f_n(x)$ is divided by $d_m(x)$ be reduced by dividing it through by
$(-)^{n-m-1}c_{n-m}$, the constant term of the final quotient with sign changed.
The result, which we shall call the r.p.q., is

$$(-)^{n-m-1}c_{n-m}^{-1}\bigl(x^{n-m} - c_1x^{n-m-1} + c_2x^{n-m-2} - \ldots + (-)^{n-m-1}c_{n-m-1}\,x\bigr). \qquad (2)$$

We now assert that the latent roots $\rho_\nu$ of $R$ are the $m$ values assumed by
this r.p.q. when $x = \beta_1, \beta_2, \ldots, \beta_m$. The determinantal expression for
$\rho_\nu$, namely

$$\rho_\nu = (-)^{n-m-1}c_{n-m}^{-1}
\begin{vmatrix}
1 & c_1 & c_2 & \cdots & c_{n-m-1} & \cdot \\
1 & \beta_\nu & \cdot & \cdots & \cdot & \cdot \\
\cdot & 1 & \beta_\nu & \cdots & \cdot & \cdot \\
\vdots & & \ddots & \ddots & & \vdots \\
\cdot & \cdot & \cdots & \cdot & 1 & \beta_\nu
\end{vmatrix}, \qquad (3)$$

shows how p F is related to the leading element of It also shows that 
p, is the coefficient of x in the r.p.q. arising when g H „ m (x) is divided by 
a:-ft. 

Deferring for the time being the proof of these statements, let the $m$
latent roots $\rho_\nu$ be named in such an order that

$$|\rho_1| \ge |\rho_2| \ge |\rho_3| \ge \ldots \ge |\rho_m|. \qquad (4)$$

It is hardly necessary to point out that this order does not usually correspond
to the descending order of the $|\beta_\nu|$. Then the sufficient condition of
convergence of the r.p.r. process is that $|\rho_1| < 1$.
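The convergence criterion can be evaluated directly once a factorization is known. The Python sketch below is an added illustration: it reads the r.p.q. as $(q(x) - q(0))/(-q(0))$, where $q$ is the exact quotient written with ordinary signed coefficients (an interpretation of the definition above, under which the sign factor is absorbed), and evaluates it at the zeros of the divisor. The quartic is the illustrative one $(x^2 - 3x + 2)(x - 10)(x - 20)$, and the function names are hypothetical.

```python
def polyval(coeffs, x):
    """Horner evaluation; coefficients are listed highest power first."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value


def latent_roots(quotient, divisor_zeros):
    """rho = (q(beta) - q(0)) / (-q(0)) for each zero beta of the divisor."""
    q0 = quotient[-1]                 # constant term of the exact quotient
    return [(polyval(quotient, b) - q0) / (-q0) for b in divisor_zeros]


quotient = [1, -30, 200]              # q(x) = (x - 10)(x - 20)
rhos = latent_roots(quotient, [1, 2]) # divisor d(x) = (x - 1)(x - 2)
dominant = max(abs(r) for r in rhos)  # must be < 1 for convergence
```

Here both values lie well inside the unit circle, which is consistent with the factor having the zeros of smallest modulus being a stable one.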

In the not infrequent case where $\rho_1$ is real and $|\rho_1| > |\rho_2|$, convergence
will be of the kind in which the successive errors of corresponding
coefficients in the r.p.r.'s, or equally well the first differences of corresponding
coefficients between consecutive r.p.r.'s, tend to geometric progressions of
common ratio $\rho_1$. The known accelerative processes based upon this
(Aitken, 1926; Steffensen, 1933) are then available, and should be applied
at as early a stage as possible.
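One familiar accelerative process of this kind is the $\delta^2$ extrapolation from three consecutive iterates. The sketch below is an added illustration, applied to a synthetic sequence whose errors are exactly geometric (the limit 3.0 and ratio 0.28 are assumed values chosen for the example); the function name is hypothetical. In practice it would be applied to each coefficient of three consecutive r.p.r.'s in turn.

```python
def aitken_delta2(x0, x1, x2):
    """Extrapolate three consecutive iterates whose errors are nearly geometric."""
    denom = x2 - 2 * x1 + x0          # second difference
    if denom == 0:
        return x2                     # no curvature to exploit; keep latest iterate
    return x2 - (x2 - x1) ** 2 / denom


limit, ratio = 3.0, 0.28              # assumed values for illustration only
seq = [limit + 0.5 * ratio ** k for k in range(3)]
estimate = aitken_delta2(*seq)        # recovers the limit from three terms
```

For an exactly geometric error the extrapolation is exact apart from rounding; for the r.p.r. sequence it is only approximate, since the subdominant latent roots also contribute.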

If, as is also not unusual, $\rho_1$ is one of a conjugate complex pair, the
familiar features of oscillatory convergence resembling a damped vibration
will be present. Here again a simple accelerative process (§ 6) is
available.

If $|\rho_1| > 1$, the sequence of r.p.r.'s will diverge. In some cases, but
not all, convergence may be restored by the use of the reciprocal equation,
that is, by performing penultimate remaindering with reversed order of
terms. On the other hand it may be advantageous to give up trying a
divisor of degree $m$ and to choose one of degree $m - 1$ or $m + 1$.

We proceed to establish the theorems enunciated above. As remarked
earlier, to economize in space and to avoid undue prolixity, we illustrate
by the case $n = 7$, $m = 4$. We prove first that, for example,

$$R\,\{1,\ \beta_2+\beta_3+\beta_4,\ \beta_2\beta_3+\beta_2\beta_4+\beta_3\beta_4,\ \beta_2\beta_3\beta_4\}
= c_3^{-1}
\begin{vmatrix}
1 & c_1 & c_2 & \cdot \\
1 & \beta_1 & \cdot & \cdot \\
\cdot & 1 & \beta_1 & \cdot \\
\cdot & \cdot & 1 & \beta_1
\end{vmatrix}
\{1,\ \beta_2+\beta_3+\beta_4,\ \beta_2\beta_3+\beta_2\beta_4+\beta_3\beta_4,\ \beta_2\beta_3\beta_4\}, \qquad (5)$$

where $R$ is as given in § 3 (7); with three results of similar form. The
latent root and latent vector are visible in the above equation.

To prove this result we first refer to § 2 (8), which shows that, apart
from sign and reversed order, the elements in the first row of $R$ (see again
§ 3 (7)) are the respective coefficients in successive terms of the quotient
resulting from the formal division of the r.p.q. $c_3^{-1}(x^3 - c_1x^2 + c_2x)$ by
$d_4(x)$. Let this r.p.q. be multiplied by

$$(x-\beta_2)(x-\beta_3)(x-\beta_4) = x^3 - (\beta_2+\beta_3+\beta_4)x^2 + (\beta_2\beta_3+\beta_2\beta_4+\beta_3\beta_4)x - \beta_2\beta_3\beta_4,$$

and let us suppose the resulting product to be divided by $d_m(x)$. The
coefficient in the fourth term of the quotient so obtained will then be the
linear combination


b t b t b % 
1 b x b t 


h 


1 '1 

1 b. 


-1 


I Cy C t 
I by by 

I by 

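$\{1,\ \beta_2+\beta_3+\beta_4,\ \beta_2\beta_3+\beta_2\beta_4+\beta_3\beta_4,\ \beta_2\beta_3\beta_4\}$. (6)

But since

$$d_4(x) = (x-\beta_1)\{x^3 - (\beta_2+\beta_3+\beta_4)x^2 + (\beta_2\beta_3+\beta_2\beta_4+\beta_3\beta_4)x - \beta_2\beta_3\beta_4\}, \qquad (7)$$

it follows that (6) will be the coefficient in the fourth term of the quotient
arising from the division of the r.p.q. $c_3^{-1}(x^3 - c_1x^2 + c_2x)$ by $x - \beta_1$; and
so, again by reference to § 2 (8), it will be

$$c_3^{-1}
\begin{vmatrix}
1 & c_1 & c_2 & \cdot \\
1 & \beta_1 & \cdot & \cdot \\
\cdot & 1 & \beta_1 & \cdot \\
\cdot & \cdot & 1 & \beta_1
\end{vmatrix}
= c_3^{-1}(\beta_1^3 - c_1\beta_1^2 + c_2\beta_1). \qquad (8)$$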

Thus (5) is established, as far as the first row of R is concerned. As for
the second row of R, we shall indicate determinants by their diagonal 
elements (with the convention that b % =c 0 = 0 and we shall write it in the 
form 


[ - I Cf>ybybyby \ + by\ (Jybyby | | tjhybyby \ ~by\ tjyby \ 

- U.AAI+AkoAI I cj>y | -by]. ( 9 ) 


Applying the same reasoning as in (6), (7) and (8), we find that the second
element in $R\{1,\ \beta_2+\beta_3+\beta_4,\ \beta_2\beta_3+\beta_2\beta_4+\beta_3\beta_4,\ \beta_2\beta_3\beta_4\}$ is

i *\ *% • • 

1 Pi 

-'i' 1 * A • ■ 

X Pi 

* A 

$$= c_3^{-1}(\beta_2+\beta_3+\beta_4)(\beta_1^3 - c_1\beta_1^2 + c_2\beta_1), \qquad (11)$$

since we have

$$b_1 - \beta_1 = \beta_2+\beta_3+\beta_4. \qquad (12)$$

In similar fashion the third row of R may be written 

c % 1 [ " I I +^i I *A*A I I I “^t I I 

- | I +^t I I I I ~^i]f (*3) 

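and we find in the same way that the third element in
$R\{1,\ \beta_2+\beta_3+\beta_4,\ \beta_2\beta_3+\beta_2\beta_4+\beta_3\beta_4,\ \beta_2\beta_3\beta_4\}$
is

$$c_3^{-1}(\beta_2\beta_3+\beta_2\beta_4+\beta_3\beta_4)(\beta_1^3 - c_1\beta_1^2 + c_2\beta_1), \qquad (14)$$

since we have

$$b_2 - \beta_1(\beta_2+\beta_3+\beta_4) = \beta_2\beta_3+\beta_2\beta_4+\beta_3\beta_4. \qquad (15)$$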



I *1 *» • 


I Pi 
i A ■ 

1 A 


(io) 


The fourth element is determined in similar fashion. Thus (5) is 
established. The corresponding results for the remaining latent roots and 
latent vectors of R follow at once from symmetry. In the general case 
where R is of order m x m a proof can be constructed following the same 
lines. 


5. Application to Numerical Examples 

It is instructive at this point to work out some numerical examples,
both with real and with complex dominant latent roots of $R$. So that the
nature and rapidity of the convergence shall be more clearly visible, we
have purposely chosen polynomials $f_n(x)$ having factors with integer
coefficients.

Example 1. To find the real root of Wallis's equation

$$f_3(z) = z^3 - 2z - 5 = 0.$$

By inspection the root is slightly greater than 2. So we transform by
$z = x + 2$ and consider the equation

$$x^3 + 6x^2 + 10x - 1 = 0.$$





Note. In polynomial division by machine a condensed form of synthetic division is
most efficacious. Nothing need be written down on the computing sheet except the
coefficients $c_j$ of the quotient, obtained one after another, concluding with the final
remainder. In our modified procedure we should similarly obtain the coefficients of the
penultimate quotient and penultimate remainder. For example, giving positive sign to
all coefficients so as to show procedure more simply, if we have to divide

$$x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots \quad\text{by}\quad x^m + b_1x^{m-1} + b_2x^{m-2} + \cdots, \qquad (1)$$

we compute and write down in a row the numerical values of

$$1,\quad c_1 = a_1 - b_1,\quad c_2 = a_2 - c_1b_1 - b_2,\quad c_3 = a_3 - c_1b_2 - c_2b_1 - b_3,\ \ldots, \qquad (2)$$

the coefficients of the penultimate remainder automatically filling up the rest of the row.
Then in a new row we enter the detached coefficients of the r.p.r., the new divisor, obtained
from the p.r. by multiplying through by the reciprocal of the leading coefficient, then the
new penultimate quotient obtained as in (2) and then the new p.r., and so on. The
summing of the signed isobaric products in (2) might be aided by having the coefficients
$1, b_1, b_2, \ldots$ suitably spaced on a loose card which could be ranged above the appropriate
$c_j$.
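The row-by-row scheme of this Note can be sketched in a few lines of Python. This is only an illustrative sketch, not the author's machine procedure: the function names `penultimate_division` and `reduce_pr` are hypothetical, coefficients are passed with their signs included, and both polynomials are assumed monic.

```python
def penultimate_division(f, d):
    """Divide monic f by monic d, stopping one step short of the final remainder.

    f and d are coefficient lists in descending powers of x (leading entry 1).
    Returns (quotient_terms, penultimate_remainder); the remainder has the
    same degree as d, so it can be reduced to a new trial divisor.
    """
    n, m = len(f) - 1, len(d) - 1
    if not 0 < m < n:
        raise ValueError("need 0 < deg(d) < deg(f)")
    rem = [float(c) for c in f]
    quotient = []
    # A full division takes n - m + 1 steps; stopping after n - m steps
    # leaves a remainder of degree m, the penultimate remainder (p.r.).
    for i in range(n - m):
        c = rem[i]                  # next quotient coefficient (d is monic)
        quotient.append(c)
        for j, dj in enumerate(d):
            rem[i + j] -= c * dj
    return quotient, rem[n - m:]


def reduce_pr(pr):
    """Divide a p.r. through by its leading coefficient to obtain the r.p.r."""
    return [c / pr[0] for c in pr]


# Quartic x^4 + 16x^3 + 71x^2 + 122x + 120 with trial divisor x^2 + 1.7x + 1.7.
q, pr = penultimate_division([1, 16, 71, 122, 120], [1, 1.7, 1.7])
print(q)              # penultimate quotient terms
print(pr)             # penultimate remainder coefficients
print(reduce_pr(pr))  # reduced penultimate remainder, the next divisor
```

Iterating `reduce_pr(penultimate_division(f, d)[1])` with the result fed back as the new `d` gives the sequence of r.p.r.'s tabulated in the examples.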

In the present example the computing sheet will read as follows:

    r.p.r.          quotient             p.r.
    -0.1            1   6                10           -1
    -0.094              6.1              10.61        -1
    -0.0945832          6.094            10.5728      -1
    -0.0945497          6.0945832        10.576445    -1
    -0.0945516          6.0945497        10.576238    -1
    -0.0945515          6.0945516        10.576249    -1

For example, 10.576445 = 10 - (-0.0945832 × 6.0945832). The p.r. in
this row is thus 10.576445x - 1; reduced by division by 10.576445 it gives
the r.p.r. x - 0.0945497 for the next division.

The required root is x = 0.0945515, z = 2.0945515. We shall see in § 6
that with slight increase of work this could be improved by several
additional digits.
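The computing sheet for Wallis's equation can be reproduced, under the assumption that the transformed equation is x^3 + 6x^2 + 10x - 1 = 0 and the first divisor is x - 0.1, by the short sketch below. The helper name `rpr_step` is hypothetical; the sheet in the text rounds intermediate entries, so printed values agree with it only to the digits shown there.

```python
def rpr_step(f, d):
    """One step of penultimate remaindering: return the reduced p.r. of f by d."""
    n, m = len(f) - 1, len(d) - 1
    rem = [float(c) for c in f]
    for i in range(n - m):          # stop one step before the final remainder
        c = rem[i]
        for j, dj in enumerate(d):
            rem[i + j] -= c * dj
    pr = rem[n - m:]
    return [c / pr[0] for c in pr]  # divide through by the leading coefficient


f = [1, 6, 10, -1]      # x^3 + 6x^2 + 10x - 1
d = [1, -0.1]           # first divisor x - 0.1
for step in range(8):
    d = rpr_step(f, d)
    print(step + 1, d[1])

x = -d[1]               # root of the transformed equation
z = x + 2               # corresponding root of the original cubic
print(x, z)
```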

Note. It calls for remark that, so far as division by a linear factor is concerned,
penultimate remaindering is really not new; it is merely a paraphrase of the old-established
process (Whittaker and Robinson, 1929) of iteration. Iteration consists in writing an
equation $\phi(x) = 0$ in a suitable form $f(x) = F(x)$, and in building a sequence $x_k$ satisfying
the recurrence $f(x_k) = F(x_{k-1})$. In the present case the choice is

$$f(x) = x, \qquad F(x) = (-)^{n-1} a_n \big/ (x^{n-1} - a_1x^{n-2} + \cdots), \qquad (3)$$

and the initial $x_0$ is simply the $b_1$ of § 2. Penultimate remaindering is, step for step,
equivalent to the following way of numerically evaluating the polynomial within the
bracket on the right of (3):


**<**--«■(**- •. ■)»+((4) 

This is of course a well-known way, on a machine provided with a transfer lever from
product to setting register, for evaluating a polynomial by stages; it requires no intermediate
copying.
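The equivalence described in this Note, between penultimate remaindering with a linear divisor and a fixed-point iteration whose denominator is evaluated by nested multiplication, may be illustrated as follows. The sketch assumes coefficients are supplied with their signs included (unlike the alternating-sign notation of the paper), and the name `fixed_point_step` is hypothetical.

```python
def fixed_point_step(coeffs, x):
    """Map x to -a_n / (x^(n-1) + a_1 x^(n-2) + ... + a_(n-1)).

    coeffs = [1, a_1, ..., a_n] with signs included. The denominator is
    built up by nested multiplication, one coefficient at a time.
    """
    *head, a_n = coeffs
    value = 0.0
    for c in head:                 # ((1*x + a_1)*x + a_2)*x + ...
        value = value * x + c
    return -a_n / value


wallis = [1, 6, 10, -1]            # x^3 + 6x^2 + 10x - 1
x = 0.1                            # initial value, i.e. first divisor x - 0.1
for _ in range(8):
    x = fixed_point_step(wallis, x)
print(x)
```

The first iterate is 1/10.61, which is the r.p.r. obtained from the p.r. 10.61x - 1 in the computing sheet of Example 1.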





Example 2. We shall try to find a cubic factor of the polynomial

$$f_4(x) = x^4 + 17x^3 + 84x^2 + 148x + 80.$$

The very crude choice $x^3$ for first divisor would give the last four terms
for p.r., roughly reduced to $x^3 + 5x^2 + 9x + 5$, and a trial with this again
suggests $x^3 + 6x^2 + 12x + 7$ as a closer divisor. From that point the computing
sheet reads as follows:

    r.p.r.                            q.    p.r.
    6.0       12.0       7.0          1     17.0       84.0       148.0       80.0
    6.54545   12.81818   7.27273            11.0       72.0       141.0       80.0
    6.80869   13.46086   7.65217            10.45455   71.18182   140.72727   . . .
    6.92150   13.77132   7.84982            10.19131   70.53914   140.34783   . . .
    6.96817   13.90586   7.93769            10.07850   70.22868   140.15018
    6.98717   13.96179   7.97462            10.03183   70.09414   140.06231
    6.99485   13.98460   7.98975            10.01283   70.03821   140.02538

Here we see a fair convergence to the factor $x^3 + 7x^2 + 14x + 8$. The
ratio of convergence is in fact 0.4. The r.p.q. is $-x/10$, and the zeros of
the divisor are -1, -2, -4. In § 6 we shall use this example to illustrate
the acceleration of convergence.
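The statement that the latent roots are the values of the r.p.q. at the zeros of the exact factor can be checked numerically. The sketch below is an illustration only: the helper names (`poly_from_zeros`, `divide`, `horner`, `latent_roots`) are hypothetical, and the exact zeros are assumed known in advance, which in practice they are not.

```python
def poly_from_zeros(zeros):
    """Coefficients (descending powers) of the monic polynomial with given zeros."""
    coeffs = [1]
    for z in zeros:
        coeffs = [a - z * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def divide(f, d):
    """Ordinary polynomial division of f by monic d: (quotient, remainder)."""
    rem = list(f)
    q = []
    for i in range(len(f) - len(d) + 1):
        c = rem[i]
        q.append(c)
        for j, dj in enumerate(d):
            rem[i + j] -= c * dj
    return q, rem[len(q):]


def horner(p, x):
    v = 0
    for c in p:
        v = v * x + c
    return v


def latent_roots(f, zeros):
    """Values of the r.p.q. at the zeros of an exact factor of f."""
    q, _ = divide(f, poly_from_zeros(zeros))
    penultimate = q[:-1] + [0]        # final quotient with constant term dropped
    # r.p.q. = penultimate quotient / (constant term with sign changed)
    return [-horner(penultimate, z) / q[-1] for z in zeros]


# Cubic factor with zeros -1, -2, -4 of x^4 + 17x^3 + 84x^2 + 148x + 80.
rho_cubic = latent_roots([1, 17, 84, 148, 80], [-1, -2, -4])
print(rho_cubic, max(abs(r) for r in rho_cubic))

# Quadratic factor x^2 + 2x + 3 of x^4 + 16x^3 + 71x^2 + 122x + 120.
s = 2 ** 0.5
rho_quad = latent_roots([1, 16, 71, 122, 120], [complex(-1, s), complex(-1, -s)])
print(abs(rho_quad[0]) ** 2)
```

For the cubic factor the largest modulus is 0.4, the ratio of convergence quoted above; for the quadratic factor the squared modulus agrees with the value 513/1600 quoted in § 6.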

Example 3. In the quartic polynomial

$$f_4(x) = x^4 + 16x^3 + 71x^2 + 122x + 120$$

the zeros of the divisor will prove to be complex. We shall try for a
quadratic factor, beginning with $x^2 + 1.7x + 1.7$, suggested by inspection
of the last three terms of the quartic. The sheet reads thus:

    r.p.r.                    q.          p.r.
    1   1.7       1.7         16.0        71.0       122.0
        2.17137   2.66726     14.3        44.99      97.69
        2.22201   3.13270     13.82863    38.30567   85.11545
        2.11631   3.22126     13.77799    37.25247   78.83769
        2.01260   3.12528     13.88369    38.39655   77.27702
        1.97075   3.02087     13.98740    39.72368   78.28546
        1.97415   2.97538     14.02925    40.33099   79.61946
        1.99000   2.97505     14.02585    40.33549   80.26777
        2.00073   2.98916     14.01000    40.14505   80.31955

The convergence towards the exact factor $x^2 + 2x + 3$ is clearly seen,
as also is the oscillation of error produced by the presence of dominant
conjugate complex roots of the governing matrix $R$. We shall also use
this example in § 6 to illustrate acceleration of convergence.






Example 4. Finally, in order to show the emergence of a valuable
feature not hitherto mentioned, we shall take the same polynomial as in
Example 3, and shall try for a cubic divisor, beginning with $x^3 + 4x^2 + 8x + 8$.
The sheet reads:

    r.p.r.                              p.r.
    4.0       8.0        8.0            16.0       71.0       122.0
    5.25      9.5        10.0           12.0       63.0       114.0
    5.72093   10.41860   11.16279       10.75      61.5       112.0
    5.89367   10.78281   11.67421       10.27907   60.58140   110.83721
    5.95836   10.91650   11.87375       10.10633   60.21719   110.32579
    5.98343   10.96696   11.95024       10.04164   60.08350   110.12625
    5.99337   10.98677   11.98015       10.01657   60.03304   110.04976
    5.99735   10.99470   11.99205       10.00663   60.01323   110.01985

Here we see a convergence towards the exact factor $x^3 + 6x^2 + 11x + 12$.
This factor has a pair of complex zeros, but they are not dominant in the
process. The dominant one is $x = -4$, the r.p.q. is $-x/10$, and the convergence
ratio is thus 0.4. This ratio is clearly seen in the last two p.r.'s
above; and we take opportunity to remark, without going into details of
proof, that the coefficients in the p.r. may be shown to have properties, in
regard to convergence and the vector of errors, quite similar to those of
the r.p.r.

The new feature is that the errors in the coefficients of the last r.p.r.,
namely -0.00265, -0.00530, -0.00795, are visibly in ratio 1:2:3. Or
we may take the differences of corresponding coefficients in the last two
r.p.r.'s, 0.00398, 0.00793, 0.01190, which show the same approximate
ratio. The reason is (§ 7) that $x^2 + 2x + 3$ is an exact factor of the divisor
$x^3 + 6x^2 + 11x + 12$; the other factor $x + 4$ has been used up, as it were, in
producing the convergence.
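The 1:2:3 pattern in the errors can be observed directly by running the iteration in floating point. The following is a sketch under the assumption that the polynomial is x^4 + 16x^3 + 71x^2 + 122x + 120 and the limit factor is x^3 + 6x^2 + 11x + 12; `rpr_step` is a hypothetical helper name, and unrounded arithmetic means the last digit may differ from the hand-computed sheet.

```python
def rpr_step(f, d):
    """One step of penultimate remaindering: return the reduced p.r. of f by d."""
    n, m = len(f) - 1, len(d) - 1
    rem = [float(c) for c in f]
    for i in range(n - m):          # stop one step before the final remainder
        c = rem[i]
        for j, dj in enumerate(d):
            rem[i + j] -= c * dj
    pr = rem[n - m:]
    return [c / pr[0] for c in pr]


f = [1, 16, 71, 122, 120]
d = [1, 4, 8, 8]                    # first trial divisor
for _ in range(7):                  # seven steps, as on the sheet
    d = rpr_step(f, d)

exact = [1, 6, 11, 12]
errors = [di - ei for di, ei in zip(d[1:], exact[1:])]
ratios = [e / errors[0] for e in errors]
print(d)
print(errors)
print(ratios)                       # close to the coefficients 1, 2, 3
```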


6. Processes for the Acceleration of Convergence 

It has already been remarked that the nature and rapidity of convergence
depend on the latent root $\rho_1$ of $R$, of greatest modulus. If $\rho_1$ is real,
and sufficiently separated in modulus from $\rho_2$, then the first differences of
corresponding coefficients in the sequence of r.p.r.'s settle down to an
approximate geometrical progression. The computer, on seeing indications
of this in the work, will then find it advantageous to stop, to form first and
second differences of corresponding coefficients in at least three consecutive
r.p.r.'s, and to use these in the following way to obtain superior
approximation.

Let three consecutive members of such a sequence of coefficients be
named $u_{k-1}, u_k, u_{k+1}$. Then a better sequence (Aitken, 1926) is
given by

$$u^{(1)}_{k+1} = \begin{vmatrix} u_{k-1} & u_k \\ u_k & u_{k+1} \end{vmatrix} \Big/ \Delta^2 u_{k-1}, \qquad (1)$$

which is very easily computed if a machine is available. A useful variant
(Steffensen, 1933) is to construct

$$u^{(1)}_{k+1} = u_{k+1} - (\Delta u_k)^2 / \Delta^2 u_{k-1}. \qquad (2)$$

Yet a further equivalent, which can serve as a check upon calculation
(not upon convergence), is

$$u^{(1)}_{k+1} = u_k - \Delta u_{k-1}\,\Delta u_k / \Delta^2 u_{k-1}. \qquad (3)$$

A check on convergence is given either by using the similar improvement
on a neighbouring triad such as $u_{k-2}, u_{k-1}, u_k$, or by trying out the process
on the whole r.p.r. and taking the resulting polynomial as a new
divisor.
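The three equivalent forms of the adjustment can be compared in exact rational arithmetic, which avoids any question of rounding in the check. This is a minimal sketch; the function name `aitken_three_forms` is hypothetical, and the three input values are the ten-digit approximations quoted for Wallis's equation in the example that follows.

```python
from fractions import Fraction


def aitken_three_forms(u_prev, u_mid, u_next):
    """Return the improved value computed by the forms (1), (2) and (3)."""
    d_prev = u_mid - u_prev          # first difference of u_{k-1}
    d_next = u_next - u_mid          # first difference of u_k
    dd = d_next - d_prev             # second difference of u_{k-1}
    form1 = (u_prev * u_next - u_mid * u_mid) / dd   # determinant form
    form2 = u_next - d_next * d_next / dd            # Steffensen's variant
    form3 = u_mid - d_prev * d_next / dd             # arithmetical check
    return form1, form2, form3


u = [Fraction(s) for s in ("0.09454979283", "0.09455157498", "0.09455147637")]
f1, f2, f3 = aitken_three_forms(*u)
print(f1 == f2 == f3)
print(float(f2))
```

Because the three forms are algebraically identical, agreement between them checks the arithmetic only; it says nothing about whether the sequence has settled into a geometric progression.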

Example 1. If the working in Wallis's equation, Example 1 of § 5,
had been taken to ten significant digits, the last three approximations to
the real root a, with first and second differences, would have been:

    u                    Δ           Δ²
    0.09454979283
                         178215
      9455157498                     -188076
                         -9861
      9455147637
             517 = 9861² / 188076
      9455148154

    check by (3):
      9455157498
          -9344 = -178215 × 9861 / 188076
      9455148154

The improved value, obtained both by Steffensen's adjustment (2) and
by the variant (3), is shown. Here convergence is so good that it is hardly
necessary to use an accelerative process. The true value (Whittaker and
Robinson, 1929, p. 106, footnote) is 0.094551481542 . . .

Example 2. We take the values of the coefficients in the last four
r.p.r.'s of Example 2, § 5, and apply the adjustment (2):

    u         Δ      Δ²        u          Δ       Δ²        u         Δ      Δ²
    6.92150                    13.77132                     7.84982
              4667                        13454                       8787
    6.96817          -2767     13.90586           -7861     7.93769          -5094
              1900                         5593                       3693
    6.98717          -1142     13.96179           -3312     7.97462          -2180
               768                         2281                       1513
    6.99485                    13.98460                     7.98975
        516 = 768² / 1142          1571 = 2281² / 3312          1050 = 1513² / 2180
    7.00001                    14.00031                     8.00025

The improved values 7.0000, 14.0003, 8.0003 are close to the true integer
values.

Oscillatory Convergence. If convergence is dominated by two conjugate
complex roots of $R$, corresponding to a certain pair of conjugate
complex roots of the exact factor that is being found (and we recall that
this need not be the factor of $f_n(x)$ having the numerically smallest roots),
then we may apply (Aitken, 1926) an accelerative process slightly more
complicated than (1) but equally effective. We take the opportunity here
of transforming it into an adjustment resembling Steffensen's, as follows.

Consider a sequence $\ldots, u_{k-1}, u_k, u_{k+1}, \ldots$ the first differences of
which are settling down, because of the presence of a dominant pair of
complex roots in $R$, to a sequence like the following:

$$\ldots,\ a\cos\alpha,\ ar\cos(\theta+\alpha),\ ar^2\cos(2\theta+\alpha),\ ar^3\cos(3\theta+\alpha),\ \ldots \qquad (4)$$

where $r$ ($< 1$) is the modulus and $\theta$ the amplitude of the roots in question,
and $\alpha$ is a constant angle induced by the initial conditions. Then it is not
difficult to show that the sequence of persymmetric determinants of the
second order, easily evaluated by machine,

$$\begin{vmatrix} \Delta u_{k-1} & \Delta u_k \\ \Delta u_k & \Delta u_{k+1} \end{vmatrix} \qquad (5)$$

is tending to a geometric progression of common ratio $r^2$, whence a
sufficient approximation to $r^2$ may be found for practical use. The
desired adjustment is then

$$u^{(1)}_{k+1} = u_{k+1} - \frac{\Delta u_k(\Delta u_{k+1} - r^2\Delta u_k)}{\Delta^2 u_k - r^2\Delta^2 u_{k-1}}. \qquad (6)$$

A variant, analogous to (3) and serving in the same way as an
arithmetical check, is

$$u^{(1)}_{k+1} = u_k - \frac{\Delta u_k(\Delta u_k - r^2\Delta u_{k-1})}{\Delta^2 u_k - r^2\Delta^2 u_{k-1}}. \qquad (7)$$





Example 3. We take the coefficients in the last five r.p.r.'s of
Example 3, § 5, and apply the adjustments (6) and (7) to them:

    u          Δ        Δ²        u          Δ         Δ²
    2.01260                       3.12528
               -4185                         -10441
    1.97075             4525      3.02087              5892
                 340                          -4549
    1.97415             1245      2.97538              4516
                1585                            -33
    1.99000             -512      2.97505              1444
                1073                           1411
    2.00073                       2.98916

The trend of persymmetric determinants made from first differences
suggests that $r^2$ is approximately 0.318. We may remark that the actual
value, obtained by substituting the roots of $x^2 + 2x + 3 = 0$ in the r.p.q.
$-(x^2 + 14x)/40$, is 513/1600 = 0.3206.

Substituting the approximation 0.318 in (6) and (7), we have the
improved values of the coefficients:

    By (6),  1.99000 - 0.01585(1073 - 0.318 × 1585) / (-512 - 0.318 × 1245) = 1.99993.
    By (7),  1.97415 - 0.01585(1585 - 0.318 × 340) / (-512 - 0.318 × 1245) = 1.99994.
    By (6),  2.97538 - (-0.04549)(-33 + 0.318 × 4549) / (4516 - 0.318 × 5892) = 2.99952.

In the case of the second coefficient we cannot make full use of (6) or (7), because of the
accidental circumstance, purposely exemplified here, that the second difference 1444 is
very close to 0.318 × 4516.
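The estimate of r^2 from persymmetric determinants and the adjustments (6) and (7) can be sketched as below for the first coefficient. The indexing is an assumption inferred from the arithmetic of this example (the adjustment is applied to the last four values of the sequence), and the function names are hypothetical.

```python
def differences(u):
    return [b - a for a, b in zip(u, u[1:])]


def estimate_r2(u):
    """Ratio of the last two second-order persymmetric determinants of the
    first differences; needs at least five consecutive values."""
    d = differences(u)
    dets = [d[i - 1] * d[i + 1] - d[i] ** 2 for i in range(1, len(d) - 1)]
    return dets[-1] / dets[-2]


def oscillatory_adjust(u, r2):
    """Adjustments in the manner of (6) and (7), using the last four values."""
    d0, d1, d2 = differences(u[-4:])
    denom = (d2 - d1) - r2 * (d1 - d0)   # second differences combined with r^2
    form6 = u[-2] - d1 * (d2 - r2 * d1) / denom
    form7 = u[-3] - d1 * (d1 - r2 * d0) / denom
    return form6, form7


first = [2.01260, 1.97075, 1.97415, 1.99000, 2.00073]
r2 = estimate_r2(first)
form6, form7 = oscillatory_adjust(first, 0.318)
print(r2)
print(form6, form7)
```

The two forms agree identically, so their agreement checks the arithmetic only. When the denominator is nearly zero, as happens for the latest values of the second coefficient, the adjustment is ill-determined and an earlier set of values has to be used instead.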


7. The Asymptotic Vector of Errors

We consider next what information can be derived from the vector of
errors or, equivalently, of first differences of corresponding coefficients in
consecutive r.p.r.'s, at a sufficiently advanced stage. We confine our
attention to the case in which the dominant latent root $\rho_1$ of $R$ is real. It
is well known, in problems concerned with determining latent roots and
latent vectors of a matrix, that if $v$ is an arbitrary vector then in general
$R^t v$ tends with increasing $t$ to the dominant latent vector, so that $R^{t+1} v$
tends to $\rho_1 R^t v$. Hence in the present application the ratios of the first
differences of coefficients from one r.p.r. to the next will tend to the latent
vector shown in § 4 (1). But the elements of this vector are coefficients
of powers of $x$ in a certain divisor of $f_n(x)$ and of $d_m(x)$, namely the
polynomial of degree $m - 1$:

$$X_{m-1}(x) = d_m(x)/(x - \beta_1), \qquad (1)$$

where $\beta_1$ corresponds to $\rho_1$.

Thus not only does iterated penultimate remaindering disclose the
factor $d_m(x)$, but by inspection of vectors of differences in consecutive
r.p.r.'s we discern the further factorization of $d_m(x)$ into $(x - \beta_1)X_{m-1}(x)$.
This is so useful an addition to the existing resources that it is well worth
while, if a machine is at hand, to preserve two or three more digits in the
entries, so that the differences can be more accurately determined. Under
suitable conditions the limiting ratios of these differences can be found
still more exactly by the use of accelerative processes like those described
in § 6.

Example 1. The ratios of corresponding first differences in Example 2
of § 5 are shown below. They are not very well determined, since there are
only three or four digits in the later differences; we shall, however, apply
the accelerative process to them, obtaining a fair approximation to the
coefficients of the true divisor $x^2 + 3x + 2$.

    Differences:                     Ratios:
    4667    13454    8787            1    2.883    1.883
    1900     5593    3693            1    2.944    1.944
     768     2281    1513            1    2.970    1.970
                         Adjusted:   1    2.989    1.989

The first differences of the ratios are themselves seen to be equal. This
points to the further factor $x + 1$.

Example 2. The above result continues to hold when the zeros of
the divisor contain conjugate complex pairs among them, always provided
that the dominant latent root of $R$ is real. Below we show the ratios of
first differences taken from Example 4 of § 5. If more digits had been kept
in the working, the accelerative process of § 6 (6) could have been applied.

    Differences:                     Ratios:
    2507    5046    7649             1    2.013    3.051
     994    1981    2991             1    1.993    3.009
     398     793    1190             1    1.992    2.990

An oscillating tendency towards the coefficients of the exact divisor
$x^2 + 2x + 3$ is clearly seen.
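The last row of ratios can be recomputed from the final two r.p.r.'s of Example 4 of § 5. The sketch below is illustrative; `difference_ratios` is a hypothetical name, and the two input rows are the five-decimal entries of the computing sheet.

```python
def difference_ratios(previous, current):
    """First differences between two consecutive r.p.r.'s, scaled so that
    the first ratio is 1."""
    diffs = [c - p for p, c in zip(previous, current)]
    return [x / diffs[0] for x in diffs]


last_but_one = [5.99337, 10.98677, 11.98015]
last = [5.99735, 10.99470, 11.99205]
ratios = difference_ratios(last_but_one, last)
print([round(r, 3) for r in ratios])
```

The ratios lie close to 1, 2, 3, the coefficients of the further factor x^2 + 2x + 3; with only three significant digits in the differences, closer agreement cannot be expected.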



8. General Observations and Conclusions

The theory of the iterated penultimate remainder, as expounded in the 
preceding sections, has reference to those stages of the process at which 
the errors have become relatively small enough for their squares, cubes 
or products of higher than the first degree to be neglected. But at the 
outset, in choosing a first trial divisor, we cannot know how small the errors 
are; they may indeed, and often will be, much too large for their higher 
powers to be neglected. The discussion of the r.p.r. sequence in such an 
unrestricted case would require a more general and a more difficult 
analysis; and it would almost certainly be hedged about by a multitude of 
special conditions and exceptions. We have ourselves examined it only 
slightly, but we dare assert in a general way, after the study of numerous 
arithmetical examples, that even in cases where the initial divisor is almost 
fantastically remote from the final exact divisor an initial apparent 
tendency to diverge may be changed in a few steps to a surprising convergence.
Any reader sufficiently interested may test this assertion by
trying, on such a polynomial as 13**+ 34* 1-40, so wild a divisor as 
$x^2 + 100x + 500$, or the like.

It is possible that the situation could be expressed in geometrical 
language somewhat as follows. The set of coefficients in a trial divisor 
may be regarded as a vector or point in space of m dimensions. In this 
r.p.r. space the vectors corresponding to exact factors are points of 
equilibrium, stable or unstable as the case may be, with respect to the 
operation of penultimate remaindering. Considering for example the 
case of quadratic divisors, intuition suggests that the regions of stability in 
the r.p.r. plane are separated by critical curves, along which there is, so to 
speak, equal attraction towards one or other stable point. The strength of 
stability at a point could be measured by the reciprocal of the associated 
$|\rho_1|$ for that region. As we have mentioned more than once, it is not to be
assumed, even for linear divisors, that the divisor corresponding to the
zero of $f_n(x)$ of smallest modulus corresponds to the point of greatest
stability. This can be seen from such an example as

$$f_3(x) = (2x - 1)(4x - 1)(8x + 1). \qquad (1)$$

The divisors $4x - 1$, $8x + 1$ will be found to be stable, while $2x - 1$ is
unstable. However, though $4x - 1$ corresponds to a root of larger modulus
than $8x + 1$ does, yet $4x - 1$ is the more stable; it will in fact be found from
§ 4 (5) that the convergence-ratios of the iteration are $-\tfrac12$ for the divisor
$4x - 1$ and $-\tfrac78$ for $8x + 1$. It is not difficult, by choosing roots with
suitable variations of sign, to construct polynomials $f_n(x)$ having different
degrees of stability among the quadratic divisors, the divisor corresponding
to the roots of least modulus being less stable than other divisors. We
may also suggest that besides critical curves separating the regions of
stability there will be singular points. For example, $x^3 + 8x^2 + 16x + 5$
has the unstable divisor $x + 5$; yet the trial divisor $x + 3$ gives the p.r.
$x + 5$, exact. The reason is that the penultimate quotient contains both
$x + 5$ and $x + 3$ as factors.
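The convergence-ratios for the three linear divisors of the cubic (1) can be worked out exactly by evaluating the r.p.q. at each zero. This is a sketch with hypothetical function names, using rational arithmetic; it also repeats the single division step of the singular-point example.

```python
from fractions import Fraction


def convergence_ratio(zeros, k):
    """Value of the r.p.q. at zeros[k], for a monic cubic with the given zeros
    and the linear divisor x - zeros[k]."""
    beta = zeros[k]
    others = [z for i, z in enumerate(zeros) if i != k]
    # Final quotient (x - others[0])(x - others[1]) = x^2 + p*x + q0.
    p = -(others[0] + others[1])
    q0 = others[0] * others[1]
    # r.p.q. = -(x^2 + p*x) / q0, evaluated at x = beta.
    return -(beta * beta + p * beta) / q0


def linear_rpr_constant(coeffs, b):
    """Constant term of the r.p.r. when monic coeffs is divided by x + b."""
    *head, a_n = coeffs
    lead = 0
    for c in head:                 # quotient coefficients by nested multiplication
        lead = lead * (-b) + c
    return Fraction(a_n) / lead    # the p.r. is lead*x + a_n


zeros = [Fraction(1, 2), Fraction(1, 4), Fraction(-1, 8)]
ratios = [convergence_ratio(zeros, k) for k in range(3)]
print(ratios)                                   # for 2x - 1, 4x - 1, 8x + 1
print(linear_rpr_constant([1, 8, 16, 5], 3))    # trial divisor x + 3
```

A ratio of modulus greater than 1 marks an unstable divisor; of the two stable ones, the smaller modulus belongs to the more strongly attracting divisor.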

It has appeared, therefore, that the method of iterated p.r. will not 
necessarily, nor usually, give the roots of a proposed algebraic equation in 
either ascending or descending order of moduli. This is one of the features 
that distinguish it from extant methods such as the root-squaring, or the 
extended Bernoulli method. 

The next question that arises is how to proceed if the r.p.r.'s show
evident signs of divergence. This will again depend, under the assumptions
we made in § 2, on the nature of the latent roots of $R$. It may be (though
naturally this could hardly ever be known beforehand) that there are
latent roots of $R$ with modulus $> 1$. In such a case we might try the
reciprocal equation; but even this might not be successful. The remedy
might then be to give up trying for a divisor of degree $m$, and to try others
of degree $m - 1$ or $m + 1$. Indeed a pedestrian but often effective procedure
is to try first of all for linear or quadratic factors, and to divide them out
from $f_n(x)$ as soon as they are sufficiently determined. However, most of
these points are matters requiring practical experience. In any case the
method of iterated p.r. is a useful addition to the existing methods of
solving algebraic equations, and would probably lend itself very well to
"programming" on high-speed electronic machines.

In conclusion, we hazard a few remarks regarding Friedman's method.
This, though doubtless suggested in idea by Lin's examples, must be
regarded as essentially distinct and new in principle. It is also, one can
hardly doubt, more complicated in theory. Friedman's procedure is as
follows: to choose a trial divisor and to divide $f_n(x)$ by it, right to the final
remainder stage; then, discarding the remainder, to take the quotient and
to divide $f_n(x)$ again by this, but according to ascending powers of $x$, until
a second quotient, of the same degree as the first trial divisor, is reached.
This second quotient, under certain conditions, is closer to an exact factor
of $f_n(x)$ than the first trial divisor was. Friedman's proposal is to iterate
this two-way division and to make a sequence of the second, the backward
quotients. Now the coefficients in the forward quotient are (§ 2)
bigradients; those in the backward quotient will also be bigradients, but
compound ones, their elements being the bigradients of the earlier set; and
in the final reduced quotient a division by the leading coefficient, itself a
compound bigradient, will have to be applied. It is possible, and examination
confirms it, that the determinant quotients may be simplified to a
certain extent; yet even so the situation remains complicated. Friedman,
confining his discussion to linear and quadratic divisors, concludes that
convergence in these cases, when it exists, is superior to that of Lin's
method. This may be so in some cases, but it is clear that the number of
operations required to obtain each iterate is a shade more than twice that
required in penultimate remaindering, and the change-over to a reversed
division makes a break of rhythm in the running routine. It is certain
also (this can be verified by trying Friedman's method on equation (1)
above) that each method can converge in circumstances where the other
diverges.


REFERENCES TO LITERATURE

Aitken, A. C., 1926. Proc. Roy. Soc. Edin., xlvi, 289-305.

Aitken, A. C., 1931. Proc. Roy. Soc. Edin., li, 80-90.

Aitken, A. C., 1937. Proc. Roy. Soc. Edin., lvii, 269-304.

Friedman, B., 1949. Comm. Pure and Appl. Math., ii, 195-208.

Fry, T. C., 1945. Quart. Appl. Math., iii, 89.

Lin, S. N., 1941. Journ. Appl. Math. and Phys., xx, 231-241. (The method
outlined here seems to have been first used in a thesis submitted for the degree
of Sc.D. to the M.I.T. in 1939.)

Steffensen, J. F., 1933. Skand. Akt.-Tidskr., 64-72.

Whittaker, E. T., and Robinson, G., 1929. The Calculus of Observations,
2nd Ed., 79-83, 106.


(Issued separately May 14, 1951)






XIV. Experiments in Diffraction Microscopy.* By G. L. Rogers,† M.A., Ph.D.,
Department of Physics, University College, Dundee, Angus.
Communicated by Professor G. D. Preston.
(With Two Plates and Nine Text-figures)

(MS. received October 31, 1950. Read May 7, 1951)

Synopsis 

Experiments have been performed, using purely optical methods, to verify and
extend the theory of Gabor's diffraction microscope. An elementary theory of
the process is first given, from which certain generalizations are provisionally
drawn. In particular, a focal length is attributed to any Fresnel diffraction pattern
and the hologram derived from it by photography. The variation of this focal
length with wavelength and scale factor is postulated by analogy with a zone-plate,
and the power-rate for a hologram is defined. These deductions are then
verified by experiment, and a summary is given at the end of § 10. Various
other confirmatory experiments are then described.

Adequate information is given about apparatus and technique to enable new
entrants into this field to obtain satisfactory results with the minimum of preliminary
trial.

§ 1. Introductory 

In a number of papers (Gabor, 1948, 1949) Gabor has described a system
of microscopy in coherent light whereby a magnified but indistinct image
of an object may be obtained without a lens, and a "reconstructed" or
more distinct image later produced with an auxiliary lens. In particular,
it is proposed to perform the first stage electronically, with a lensless
electron microscope, and the second stage optically with a luminous
source and suitable reconstructing lens.

The present paper describes an experimental study, using optical
methods throughout, of this very ingenious idea. Though the application
to electron microscopy has not been entirely forgotten, the work
has been directed to the method in its own right. As a result a number
of useful generalizations have been discovered empirically and verified
theoretically which, though doubtless implicit in the many equations
of Gabor (1949), can also with profit be stated explicitly in simpler
physical terms.

§ 2. Elementary Theory: Fresnel Diffraction
(a) Circular Symmetry

Consider a point source of monochromatic light at O and a small
scattering object at R. In practice, a dust particle or fine droplet serves
admirably. Let us consider the Fresnel diffraction pattern at a plane
PM due to the direct radiation from O and scattered radiation from
R (fig. 1). We take OR perpendicular to PM, with M on OR. Also
let OR = a, RM = b, in accordance with the notation generally used in
Fresnel diffraction.

* This paper was assisted in publication by a grant from the Carnegie Trust
for the Universities of Scotland.

† Since appointed Lecturer at Victoria University College, Wellington, New
Zealand.



The direct ray travels along a path OP. If we take PM = x, we get:

$$OP^2 = (a + b)^2 + x^2 = OM^2\left(1 + \frac{x^2}{OM^2}\right).$$

Suppose now $x \ll OM$, then:

$$OP \approx OM + \frac{x^2}{2\,OM} = a + b + \frac{x^2}{2(a + b)}.$$

Similarly

$$RP \approx RM + \frac{x^2}{2\,RM} = b + \frac{x^2}{2b}.$$

If now we allow for the fact that the scattered light has to go from O
to R before it is scattered, and if we assume zero phase-change on
scattering, we get the path from O to P via R, viz.,

$$OR + RP \approx a + b + \frac{x^2}{2b}.$$

The path difference between the two disturbances will then be:

$$\Delta = \frac{x^2}{2}\left(\frac{1}{b} - \frac{1}{a + b}\right).$$

It is now convenient to replace the bracket by the parameter $1/f$ according
to the equation:

$$\frac{1}{f} = \frac{1}{b} - \frac{1}{a + b}.$$

It will be seen at once that this parameter has a physical meaning, which
experiment shows to be fundamental to the whole process. In short,
f is numerically equal to the focal length of that divergent lens which, placed
in the screen position, would image the source in the plane of the object.

We get now:

$$\Delta = \frac{x^2}{2f}.$$

Since the intensity of illumination at P is a maximum when $\Delta$ is an
even number of half wavelengths, and minimum if $\Delta$ is an odd number
of half wavelengths, we get that:

$$\frac{x^2}{2f} = p\,\frac{\lambda}{2}, \qquad p \text{ even} \to \text{max.}, \quad p \text{ odd} \to \text{min.}$$

This gives $x^2 = fp\lambda$ as defining a series of rings about M, with maxima
at even values of p and minima at odd values of p. The regions between
will have intermediate intensities varying in an approximately sinusoidal
manner.
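The ring radii that follow from the path-difference argument can be tabulated with a few lines of code. This is a sketch only: the function names are hypothetical and the distances and wavelength below are illustrative values chosen for round numbers, not figures from the experiments.

```python
import math


def focal_parameter(a, b):
    """Parameter f defined by 1/f = 1/b - 1/(a + b), i.e. f = b(a + b)/a."""
    return 1.0 / (1.0 / b - 1.0 / (a + b))


def ring_radii(a, b, wavelength, count):
    """Radii x_p with x_p^2 = f * p * wavelength for p = 1, ..., count.
    With zero phase change on scattering, even p gives maxima, odd p minima."""
    f = focal_parameter(a, b)
    return [math.sqrt(f * p * wavelength) for p in range(1, count + 1)]


# Hypothetical geometry: source to object 1 m, object to screen 1 m,
# wavelength 500 nm (all in metres).
f = focal_parameter(1.0, 1.0)
radii = ring_radii(1.0, 1.0, 5e-7, 4)
print(f)
print(radii)
```

Since the squared radii increase in proportion to p, the rings crowd together away from the centre, exactly as in a zone plate.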

If the scattering partiole, R, introduces a phase shift, the result is an 
opening out or dosing in of the system of rings, corresponding to the 
addition of a constant term to A. If, for example, there is a phase 
delay of \n, corresponding to a path increase of £A, we get: 


and 


A 


3* 1 A 
2 7 + 4 



1 A A A 

2f~ P 2 4 = ? ' 2 


where 





196 


0. L. Rogers , Experiments 


The first minimum occurs at p = 1 or } = } and the first Tnaximam 
oocurs at p => 2 or q = If . Thus the rings defined by q ■= 1 , 2, 3, eto., 
which in the previous case were minima or maxima, now become rings 
of average intensity. Suppose now we replace the sinusoidal variations 
of intensity by abrupt changes of intensity at the positions g ™ 1, 2, 8, 
eto., as the intensity moves through the average, so that / ■=« 0 from 

g = 0-* 1, 2->3, 4->5, eto., and 1 = 1 _when q goes from 1-^2 or 

3-*-4 or B->6. 

Such an arrangement, translated from intensities into densities, is the
well-known zone plate.

A zone plate is thus a black and white approximation to the Fresnel
diffraction pattern of a small object scattering the incident light with
a ½π change of phase.

In particular, if x₁ is the radius of the first ring (where q = 1) we
have:

$$x_1^2 = f\lambda, \quad\text{or}\quad f = \frac{x_1^2}{\lambda},$$

the well-known expression for the focal length of a zone plate.

Thus it will be seen that f is the focal length of the zone plate, as
well as the parameter of the diffraction pattern. We note further that
if O is a source of monochromatic light of wavelength λ, and R be removed,
the placing of a zone plate of focal length f in the position of the screen
will result in the production of an image at the point R of the source O.
The zone plate may thus be said to “reconstruct” the object R in the
correct position.

We shall see in general that if the Fresnel diffraction pattern of an
object is photographed or otherwise reproduced, and then placed in the
original diverging beam, it “reconstructs” the object from the point
source in its original position. (This constitutes the Gabor method of
diffraction microscopy.) But further, if the reproduction of the Fresnel
diffraction pattern (called by Gabor the “hologram”) is placed elsewhere
in a diverging beam, it will produce a reconstruction of the object in a
position related to the source position by the lens formula.

Two other properties of the zone plate deserve attention. First, as
is well known, the zone plate can act as both a positive and a negative
lens. That is to say, its focal length is strictly speaking ±f. Thus
not only does it produce an image of O at R, but also an image of O
where a convergent lens of focal length f would form an image of O.
There are thus two reconstructions of the point R. This has been found
by Gabor.

Gabor has shown that the process of photographing the diffraction
pattern to form the hologram results in a loss of information: information
as to the phase of the disturbance in the plane PM. (See also
Bragg, 1950.) The assumption in his method is that if the direct beam
is of much greater intensity than the scattered beam, the phases of the
resultant disturbance will nowhere differ greatly from that of the direct
beam. As the latter is the phase supplied in the reconstruction, this
loss of information is, to the first approximation, unimportant. He
goes on to show that, as a second approximation, this loss of information
results in the production of a secondary image on the side of O remote
from R: making the approximation that OR ≪ RM, he gets the
secondary image at R′ where R′O = OR. This also holds for a zone
plate of focal length ≫ OM. It is clear that the double sign of the
zone plate focal length is equivalent to the phenomenon of the hologram
in producing two images.
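
The two reconstructions can be checked numerically with the ordinary thin-lens formula. The sketch below is only an illustration under an assumed sign convention (distances measured from the plate, a negative image distance meaning a virtual image on the source side); the function names and sample distances are hypothetical.

```python
import math

def focal_parameter(a, b):
    """f = b(a + b)/a, equivalent to 1/f = 1/b - 1/(a + b)."""
    return b * (a + b) / a

def image_distance(source_distance, focal_length):
    """Thin-lens formula 1/s' = 1/F - 1/s.

    s' > 0: real image beyond the plate; s' < 0: virtual image on the
    source side (assumed convention for this sketch).
    """
    inverse = 1.0 / focal_length - 1.0 / source_distance
    return math.inf if inverse == 0 else 1.0 / inverse

# Assumed geometry with OR << RM (arbitrary units).
a, b = 0.001, 1.0
f = focal_parameter(a, b)

primary = image_distance(a + b, -f)   # divergent action: image at R
twin = image_distance(a + b, +f)      # convergent action: secondary image R'

print("primary image distance:", primary)         # close to -b
print("R'O:", abs(twin) - (a + b), " OR:", a)     # nearly equal when a << b
```

With the negative sign the image falls at distance b on the source side, that is at R; with the positive sign it falls just beyond the source, at a distance from O that approaches OR as OR/RM becomes small.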

The second property of the zone plate worthy of study is its possession
of a whole series of secondary focal lengths, corresponding to higher
orders in a diffraction grating. The powers of these lenses go up as the
odd integers, 1, 3, 5, etc., and may be regarded as higher orders of the
fundamental power. It will be seen that the even orders are missing,
as in the case of a grating with equal black and white stripes. The
reason is in both cases the same. In the first order position, the
transparent areas in both devices transmit a range of phases from 0 to π,
2π to 3π, etc., which reinforce. But in all even orders, the range of
phases transmitted runs from 0 to 2nπ, 4nπ to 6nπ, etc., and darkness
results. If a zone plate were constructed to transmit only the first
third-period zone, blocking the next two-thirds and so forth, it would be
found to possess all powers bar multiples of 3. We have verified that all
multiples of 3 are absent from the zone plate in fig. 2.

Fig. 2.—Circular zone plate with third order missing from its
higher order foci
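
The missing orders can also be seen from a Fourier-series argument, which is a different route from the phase-range reasoning in the text. Regarded as a function of x², a zone plate is a periodic black and white pattern, and the amplitude sent into order n by a pattern with open fraction d is proportional to sin(nπd)/(nπ). The sketch below assumes this standard result; the function name is hypothetical.

```python
import math

def order_amplitude(n, open_fraction):
    """Relative amplitude of order n for a periodic open/opaque pattern.

    Standard Fourier coefficient of a rectangular transmission profile:
    |sin(n * pi * d) / (n * pi)| for n != 0, and d for n == 0.
    """
    if n == 0:
        return open_fraction
    return abs(math.sin(math.pi * n * open_fraction) / (math.pi * n))

for label, d in (("half open", 0.5), ("one-third open", 1.0 / 3.0)):
    present = [n for n in range(1, 10) if order_amplitude(n, d) > 1e-12]
    print(label, "orders present:", present)
```

For equal open and opaque zones only odd orders survive; for an open fraction of one third every order survives except the multiples of 3.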

The reason for this lies in the fact that the x values correspond to
integral values of p in √(pfλ) and hence, say, a trebling of p throughout
gives a similar set of x values corresponding to a new focal length
f′ = f/3. This property may be peculiar to systems of circular or linear
symmetry where the x's recur at regular values of √p.

§ 3. Elementary Theory: Fresnel Diffraction

(b) Straight Line Symmetry

In the case of an object with a straight line symmetry, a so-called
“cylindrical” wavefront is often taken, though to the degree of
approximation normally used in these discussions similar results occur with
spherical wave-fronts impinging on the straight line object. Thus
Jenkins and White (1937), using cylindrical wavefronts, and Airy (1877),
using spherical wavefronts, both arrive at Fresnel Integrals or the
equivalent Cornu spiral as the correct method of solving these problems.

The Fresnel Integrals are tabulated in terms of a dimensionless
parameter, v (say), and the minima and maxima of a particular problem
may be obtained from the tables in terms of v. In the case of Cornu’s
Spiral, however, owing to the fact that we plot dimensionless numbers
on some particular linear scale, the parameter v acquires the appearance
of linear dimensions, and is said to be the length along the arc of the
spiral. But this arises from the fact that the scale of the axis is itself a
dimensional quantity, to which v is strictly proportional; a fact often
overlooked in the discussion of graphical problems.

If O (fig. 1) is the source and R the linear object producing a Fresnel
diffraction pattern in the plane PM, we get the positions of minima and
maxima in terms of the v values in the tables. These have to be
translated into x values in the PM plane. If OR = a, RM = b, it can be
shown that for a wavelength λ, the relation between x and v is (Jenkins
and White):

$$x = v\sqrt{\frac{b\lambda(a+b)}{2a}}$$

If, now, we attempt to express this relation in terms of our
parameter, f, we see at once:

$$\frac{1}{f} = \frac{1}{b} - \frac{1}{a+b} = \frac{a+b-b}{b(a+b)} = \frac{a}{b(a+b)}$$

$$\therefore\; f = \frac{b(a+b)}{a}$$

and in particular:

$$x = v\sqrt{\frac{b\lambda(a+b)}{2a}} = v\sqrt{\frac{f\lambda}{2}}$$

We see, therefore, that the scale of the pattern is, as before, a linear
function of √(fλ) where f is the focal length of a lens, placed at the screen,
to image the source on the object. The scale thus depends on the value
of f rather than on the individual values of a and b.
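
A short numerical check of this scaling is sketched below. It assumes the relation in its standard form, x = v√(bλ(a + b)/2a) = v√(fλ/2), and uses illustrative distances that are not taken from the paper; the function names are hypothetical.

```python
import math

def focal_parameter(a, b):
    """f = b(a + b)/a, with a = OR and b = RM."""
    return b * (a + b) / a

def v_to_x(v, f, wavelength):
    """Convert the dimensionless Cornu parameter v to a distance x in the PM plane."""
    return v * math.sqrt(f * wavelength / 2.0)

def v_to_x_from_distances(v, a, b, wavelength):
    """Same conversion written directly in terms of a and b."""
    return v * math.sqrt(b * wavelength * (a + b) / (2.0 * a))

# Two assumed geometries sharing the same f give the same pattern scale.
wavelength = 5.461e-7
a1, b1 = 0.05, 0.20
f = focal_parameter(a1, b1)
print(v_to_x(1.0, f, wavelength), v_to_x_from_distances(1.0, a1, b1, wavelength))
```

Both routes give the same x, which is the sense in which the scale depends on f alone.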





§ 4. The Linear Zone Plate

It is possible to divide up a cylindrical wave-front into a set of linear
half-period zones, and thus get a linear zone plate analogous to a
cylindrical lens.

There are, in practice, three ways of doing this wave-front division.
In the first, the wavefront is left open from the centre of the wave to
points where the disturbance is out of phase (fig. 3). This occurs
when the tangent to the Cornu spiral first becomes vertical, and this
point occurs when v² = 1. The next zone, which is blackened, runs from
here to the next position with a vertical tangent, at a phase angle of
3π/2 and a v value of √3. As the angle ψ of the tangent is always given
by ψ = ½πv², we get zone boundaries at v values given by:

$$v^2 = 1, 3, 5, 7, 9, \text{ etc.}$$

Fig. 3.—Cornu's spiral for type I linear zone plate












With this system of zone boundaries, we get, for one-half of the zone
plate, a vector from the origin of the spiral to the point where v = 1,
which lies at an angle tan⁻¹(0.4383/0.7799) to the x axis, and then a set
of substantially horizontal vectors arising from the outer zones. The
lengths of these vectors can be obtained approximately from the radius of
curvature formula of the spiral:

$$\rho = \frac{1}{\pi v}$$


For example, the open zone from v² = 3 to v² = 5 can be regarded as
having an average v² value of 4, giving v = 2 and ρ = 1/π√4. The length
of the 3→5 vector will thus be 2ρ = 2/π√4. Similarly the length of
the 7→9 vector will be 2ρ = 2/π√8. Thus the sum of the outer zones:

$$S_{\text{outer}} = \frac{2}{\pi}\left[\frac{1}{\sqrt{4}} + \frac{1}{\sqrt{8}} + \frac{1}{\sqrt{12}} + \cdots + \frac{1}{\sqrt{4n}} + \cdots\right]$$

or

$$S_{\text{outer}} = \frac{1}{\pi}\left[\frac{1}{\sqrt{1}} + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \cdots\right].$$

But the above sum is divergent. Thus, in the absence of an obliquity
factor, these horizontal vectors add up to an indefinitely large amount,
and ultimately swamp the vector from the inner zone. The system
thus gives a maximum on the axis at the chosen point.
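
The divergence is slow, and a few partial sums make its character plain. The sketch below evaluates the series (1/π)[1/√1 + 1/√2 + 1/√3 + …] for an assumed finite number of outer zones and compares it with the elementary lower bound 2(√(N + 1) − 1)/π; the function names are hypothetical.

```python
import math

def outer_sum(n_zones):
    """Partial sum (1/pi) * sum_{n=1}^{N} 1/sqrt(n) of the outer-zone vectors."""
    return sum(1.0 / math.sqrt(n) for n in range(1, n_zones + 1)) / math.pi

def lower_bound(n_zones):
    """Integral comparison: sum_{n=1}^{N} 1/sqrt(n) > 2*(sqrt(N + 1) - 1)."""
    return 2.0 * (math.sqrt(n_zones + 1) - 1.0) / math.pi

for n in (10, 100, 1000, 10000):
    print(n, outer_sum(n), lower_bound(n))
```

The partial sums grow roughly as the square root of the number of zones, so no finite limit exists.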

When we examine the possible higher orders of this system, we find
that the even orders vanish, because the outer zones contribute regions
which are substantially closed circles on the Cornu spiral, leaving the
initial vector of length similar to that of the unobstructed path. Odd
orders will be present, as the open areas then correspond to an odd
number of ½ turns in the spiral giving a divergent series of substantially
horizontal vectors.

The second type of zone plate obtainable is that which results by
considering the intersection of the Cornu spiral with the line x = y.
This was the system originally chosen for practical experiments, as all
the vectors lie along x = y in the focal position. Also, it gives a maximum
value for the initial vector. The diagram for this type is shown in
fig. 4. A finite zone-plate of this type is shown in fig. 5.

In order to get the values we have to solve the equation
C(v) = S(v) where C(v) and S(v) are the Fresnel integrals. This can be
done by inspection of the tables, but approximate roots can be readily
obtained. For this, we assume that Cornu’s spiral always crosses the
line x = y at right angles. Although not strictly accurate for the first
turn or two, the approximation becomes rapidly better as v increases.



Fig. 4.—Cornu's spiral for type II linear zone plate

Fig. 5.—Type II linear zone plate


Since the slope of x = y is +1 this means we have to find the points on
Cornu’s spiral where the slope is −1.

If tan ψ = −1, then ψ = ¾π ± nπ and we also have ψ = ½πv².
Hence the solution of the equation is:

$$v^2 = 2n - \tfrac{1}{2}$$

or

$$v^2 = 1\tfrac{1}{2},\; 3\tfrac{1}{2},\; 5\tfrac{1}{2},\; \ldots \text{ etc.}$$


















We now have the range 0 to v² = 1½ open, close v² = 1½ to v² = 3½ and
open v² = 3½ to v² = 5½.
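
The quality of the approximation v² = 2n − ½ can be examined numerically. The sketch below is an illustration only: it evaluates C(v) − S(v) by Simpson's rule (taking the Fresnel integrals as the integrals of cos and sin of ½πt²) and locates the true crossings of the line x = y by bisection. The function names, step count and bracketing intervals are assumptions of this sketch.

```python
import math

def c_minus_s(v, steps=2000):
    """C(v) - S(v) by Simpson's rule; steps must be even."""
    h = v / steps

    def g(t):
        phase = 0.5 * math.pi * t * t
        return math.cos(phase) - math.sin(phase)

    total = g(0.0) + g(v)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(k * h)
    return total * h / 3.0

def root_near(n, iterations=50):
    """Root of C(v) = S(v) bracketed between the turning points of C - S
    at v**2 = 2n - 1.5 and v**2 = 2n + 0.5."""
    lo, hi = math.sqrt(2 * n - 1.5), math.sqrt(2 * n + 0.5)
    f_lo = c_minus_s(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = c_minus_s(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

roots = [root_near(n) for n in range(1, 5)]
for n, v in enumerate(roots, start=1):
    print(n, "exact v^2:", v * v, " approximate v^2:", 2 * n - 0.5)
```

The first crossing lies a little beyond the approximate value, and the discrepancy shrinks for the later crossings, as the text anticipates.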

This latter gives a vector along x = y to add to the first vector. Taking
4½ as the average value of v², we get its length to be 2ρ = 2√2/π√9.

The outer vectors then become:

$$\frac{2\sqrt{2}}{\pi}\left[\frac{1}{\sqrt{9}} + \frac{1}{\sqrt{17}} + \frac{1}{\sqrt{25}} + \cdots + \frac{1}{\sqrt{8m+1}} + \cdots\right]$$

This, again, is a divergent series indicating a bright line corresponding
to the first order.

The high order effects with this grating differ a little from those of
the first type. Even orders are missing, as with the other types and for
the same reason. The outer zones make little or no contribution, being
virtually closed circles.

The first order which produces vectors along the line x = y is the
fifth order. The third order produces vectors from the outer zones
which lie substantially at right angles to the line x = y. As this series
is also divergent, we must suppose that this vector will swamp the
original vector, and give a bright line with a phase ½π different from the
first order.

The fifth order series is opposed to the initial vector, and must swamp
it, giving a disturbance π out of phase with the first order disturbance.
With a finite zone plate, of the right number of terms, the 5th order
system may give darkness, the finite series from the outer zones being
equal and opposite to the initial vector.

Similarly the 7th order gives a system of vectors perpendicular to
x = y and 1½π out of phase with the initial vector. The 9th order is the
first where the outer vectors are parallel to the initial vector and in the
same sense.

We have shown that the sum of the outer vectors for the first order
is

$$\frac{2\sqrt{2}}{\pi}\left[\frac{1}{\sqrt{9}} + \frac{1}{\sqrt{17}} + \frac{1}{\sqrt{25}} + \cdots + \frac{1}{\sqrt{8m+1}} + \cdots\right].$$

Similarly, by considering the average radius of the approximately circular
arcs in the 5th order, we get a sum:

$$\frac{2\sqrt{2}}{\pi\sqrt{5}}\left[\frac{1}{\sqrt{9}} + \frac{1}{\sqrt{17}} + \frac{1}{\sqrt{25}} + \cdots + \frac{1}{\sqrt{8m+1}} + \cdots\right].$$

The ratio of these two vectors is 1 : 1/√5. In general the nth order will give
rise to an outer vector sum of magnitude 1/√n of that of the first order.
We thus see that the intensities fall off as the order, and become low at
high orders.

The third type of zone plate has an initial zone going a little beyond
the maximum initial vector from the origin to the first intersection with
x = y. Here we allow the 1st or central zone to go out to the point
where the phase is π, i.e., where the Cornu’s spiral is next horizontal
(fig. 6). This occurs, of course, where ½πv² = π and gives v² = 2 as the
edge of the zone. Next blot out the zone from v² = 2 to v² = 4 and leave
open v² = 4 to v² = 6. This latter gives a vertical vector directed
upwards. We thus get a series, with zone edges at v² = 2, 4, 6, 8 . . . etc.
Dividing throughout by 2, the ratio of these distances (in v) is as the
square roots of the natural numbers, as in the circular zone plates. Once
again, the series is divergent, indicating a swamping of the initial vector.

Fig. 6.—Cornu's spiral for type III linear zone plate













Even orders once again disappear and odd orders make their appearance,
but without alternations in phase.

The use of a linear zone plate with a finite number of terms may
require special consideration. In fact, experimental work has shown
that with a suitable finite number of zones, black lines can be obtained
instead of white ones. The detailed consideration of these problems is
deferred until a later paper.


§ 5. Apparatus 

In the early work an ordinary Osira lamp was used as light source,
with the glass bulb removed, and the whole in a box to protect the
operator from U.V. light. An image of the arc was focused by a 10 mm.
microscope objective on to a small hole 10–12 μ diameter of approximately
circular outline made by Boys' method (Strong, 1938). This is not an
entirely satisfactory procedure, because it is difficult to produce a very
fine hole, but it does very greatly reduce the number of spurious fringes
otherwise produced by dust in the system. The main difficulty lay,
however, in the moderate intrinsic brilliance of the Osira lamp.

Later work was therefore carried out with one of the recently
developed high-pressure mercury lamps, running at 250 watts, and
developing the greater part of the light in a region between the electrodes
of 3 to 4 mm. diameter. The plasma in the Osira filled a tube 1 by 3 cm.
This compact source was run in the region of a disk carrying a number
of pin-holes of various diameters, from 20 to 1,000 μ, which disk could be
rotated from outside the lamphouse to bring each hole into play in turn
as required. The pin-holes allow a cone of light to pass, and this falls
on to one side of a microscope objective, either directly or after passing
through a colour filter mounted on another disk. The distance of hole
to objective is the “working distance” of the objective, and the side of
the lamp-house is made to carry the objective. This arrangement forms
a reduced image of the hole, outside the lamp-house, in a convenient
position for use (fig. 7).

The “optical bench” consists of a wooden shelf with two metre sticks
screwed to it to give an edge against which other units could be slid.
These consisted of (i) an object holder, (ii) a plate holder, (iii) a hologram
holder, (iv) a 7 in. Aero-Ektar lens, (v) an eyepiece, and occasionally
(vi) a small lensless “camera”. A night microscope (Rogers, 1948) was
also available but was not greatly used when the new arc was put into
operation (fig. 8).

(i) The object holder simply consisted of a jig to hold 1 to 4
thicknesses of i-in. patent plate glass, on which photographic objects were
mounted.

(ii) The plate holder was in fact quite a complex device which was
designed to allow a printing frame, loaded in the dark-room with a plate,
to be put in an accurately pre-determined position. This consisted of a
flat board resting on the shelf and a vertical front with a hole in it. The
frame was held against this front with springs. Provision was made for
the insertion of a filter or compensating plate before the frame, if
necessary. A medium-focus convex lens, placed behind the frame, was
focussed on the plane holding the plate. This thus enabled the whole
carriage to be sited in the correct position, the frame being away, by
using this lens as a low-powered eyepiece.

Fig. 7.—Diagram of lamphouse. (1) Choke. (2) Box-type arc. (3) Hole disk.
(4) Filter disk. (5) Hole-changing knob. (6) Objective. (7) Device for
locating hole disk







(iii) The hologram holder was a simplified version of the above,
without lens, designed to hold a hologram against a vertical surface
with a hole in it, for convenience in reconstruction. The next two
items (iv) and (v) are self-explanatory.

(vi) The small lensless camera consists of a shutter with a circular
chamber behind. The back surface of this is a cap, with arrangements
for holding a plate the size of a postage stamp. It was originally designed
for work with an experimental Bragg "X-ray Microscope", but was useful
for photographing the reconstructed images of holograms, when an
auxiliary lens was employed.

For normal purposes a run of 2 metres was sufficient, but for certain 
experiments use was made of a shelf at the same level in an adjacent 
room, giving in all a possible run of 13 metres. 

§ 6. Technique 

The essence of diffraction microscopy is the production and
photography of Fresnel Diffraction patterns. For this purpose monochromatic
effects are necessary. The mercury arc gives a number of components
in the visible, the most important of which are those at 4358, 5461 and
the doublet at 5770–5791 Å. The lines at 4047 and 4078 are,
fortunately, comparatively weak, and the two red lines at 6162 and 6232,
while useable, are not very strong. There is also a strong U.V. line at
3660 Å which can fortunately be suppressed easily by any standard U.V.
cutting filter.

To obtain a photograph in monochromatic light, it is unnecessary to
filter the radiation until it contains only one component. Any number
of wavelengths may be allowed to fall on the plate, as long as the plate
is sensitive to only one of them. Using a well-known series of filters
with sharp short-wave cuts, in combination with a plate of suitable
sensitivity, each of these lines can be isolated.

In practice it is found that the following combinations are effective:
Ilford "Q" filter to cut the U.V. and partially suppress 4047 and 4078,
together with an "ordinary" plate, will record only 4358. An ordinary
plate contains a pure silver-bromide emulsion without any sensitizing
dye. For recording the 5461 it is convenient to use the Wratten 77
or 77A, which not only cuts the blue but suppresses the yellow. If this
is used with any orthochromatic plate, the red passed by these filters is
quite unimportant. The yellow can be isolated with the Wratten 22
filter. The usual type of orthochromatic emulsion is a little slow to this
radiation, but a panchromatic emulsion might record some red. Finally,
the red lines can be isolated by an Ilford Narrow Cut Red or a Wratten
26, and recorded on a panchromatic plate. This procedure is slow, and
was only used in wavelength dependence tests. Consequently the blue
and green lines were most frequently used.




The objects were normally small scale reproductions from black and
white line drawings, though one or two attempts to reproduce
continuous tone originals have been made. The normal procedure was to
remove the shutter from the lensless camera and replace it with a
microscope objective screwed into a suitable mount. By successive exposures,
it was possible to focus this camera, and when this was done a series of
objects would be prepared before disturbing the adjustment. Although
technically the objective should have been used at its “working distance”
of 160 mm. it was found that quite satisfactory results could be obtained
at 1 or even 2 metres. Maximum Resolution plates were used
throughout, and the progress of the work was followed under a microscope. The
objects, when dry, were cemented with D.P.X., a standard microscopical
mounting agent, on to a J-in. patent plate glass.

The initial diffraction patterns were formed by allowing light to
diverge from the star image of the source through the object, and thus
on to the plate. The effective positions of source and object were
ascertained by using the Aero-Ektar to form images of each, which were
located by the eyepiece. The actual positions were then obtained by
Newton’s formula. At a later stage these observations were correlated
with the index positions of the holders, and a “zero error” established
for each.
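
The reduction by Newton's formula is a one-line calculation. In the sketch below the formula is taken in its usual form x·x′ = F², with x and x′ the distances of object and image from the respective focal points of the auxiliary lens. The 177.8 mm focal length is simply 7 in. converted to millimetres, and the sample image offset is hypothetical.

```python
def newton_conjugate(focal_length, offset):
    """Newton's formula x * x' = F**2.

    Given the distance of the image from the second focal point, return
    the distance of the object from the first focal point (or vice versa).
    """
    if offset == 0:
        raise ValueError("offset from the focal point must be non-zero")
    return focal_length ** 2 / offset

F_MM = 7 * 25.4          # 7 in. lens, in millimetres
image_offset_mm = 50.0   # hypothetical eyepiece reading beyond the focal point

object_offset_mm = newton_conjugate(F_MM, image_offset_mm)
print("object lies", object_offset_mm, "mm from the first focal point")
```

Because only distances from the focal points enter, the thickness of the lens and the position of its principal planes need not be known.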

Owing to the readiness with which quarter plates can be obtained,
these were taken as a standard plate size. A 2½-in. square aperture at
one end of the plate received the diffraction pattern, through a hole in
the mask, and an area 2½ in. × i in. at the other end was subsequently
exposed to the same radiation through a Kodak No. 1 step wedge. This
consists of a set of densities from 0 to 1.6 in steps of 0.15 approximately.
It also carries three coloured patches, red, green and blue, originally
intended for the identification of three-colour separation negatives. To
modify it to identify the four principal mercury lines, a small patch of
Wratten 22 was added. This step wedge enables a measure of control
to be exercised over the processing, and also provides an index of the
radiation by the use of the patches.

The plate is loaded into a printing frame, carrying the mask, in the
dark-room and wrapped in black paper for transfer to the optical bench.
It is located in the holder, and exposure is effected by removing the
black paper for the required time. The step wedge is exposed
immediately afterwards.

Development is carried out in an ordinary M.Q. developer to a medium
contrast, depending on the emulsion used. An estimate of the contrast
is made by eye, and the printing technique varied accordingly. Low
contrast negatives are printed on process plates (by contact) and high
contrast negatives on “Ilford Ordinary” which gives a softer result.
In accordance with the recommendation of Gabor an over-all contrast of
about two is aimed at, but it is not found necessary to achieve this with
any very great precision. A value a little less, if anything, is favoured,
as higher values produce a characteristic effect of “burning out” dark
lines, to give an artificially light centre.

The positive print is mounted with D.P.X. on a J-in. plate, as for
the object, and placed in the hologram holder. The object is removed,
and the reconstructed image is photographed directly with the plate
holder, or indirectly with the Aero-Ektar and lensless camera. The
former is generally preferred. Possible arrangements are shown in
fig. 9.


§ 7. Existence of a Focus 

As indicated in the theoretical sections, significance is attached to
the parameter, f, and experiments were directed to the elucidation and
verification of this idea and its consequences.

With the H.P. Lamp, the major part of the work was done with star
images. In particular, a hologram negative (H 138) was taken in blue
light (4358) of an object which was a photographic replica (by contact
printing) of a micrometer eyepiece scale 1 cm. long divided into 100
parts.

A positive print of this (H 140) was mounted and subjected to careful
study in Hg green light. In particular the source-hologram distance
was varied and the image located. There are two ways of doing this,
both using an auxiliary lens. In the first place the auxiliary lens is used
to form a real image of both source and subsidiary images. These
may be located by an eyepiece. The method of no-parallax is not
available since all the light passes through a single point region, and no
differences of viewpoint are available. Location must thus depend on
an impression of sharpness by no means easy to ascertain with a highly
coherent illuminating system. It was thus felt that a single eyepiece
determination would be of doubtful value, and in practice a series of at
least six observations, with the auxiliary lens moved between, were
taken as a single "location run" and the image position calculated from
each observation. The mean of these observations was taken as the
position of the image.
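
The reduction of a location run amounts to averaging the individual determinations. A minimal sketch follows; the six readings are hypothetical numbers invented for illustration, and the standard error is an addition of this sketch which the text does not say was computed.

```python
import statistics

def location_from_run(positions):
    """Mean image position from one location run of at least six observations.

    Also returns the standard error of the mean as a rough indication of
    the reliability of the run (an addition of this sketch).
    """
    if len(positions) < 6:
        raise ValueError("a location run needs at least six observations")
    mean = statistics.mean(positions)
    standard_error = statistics.stdev(positions) / len(positions) ** 0.5
    return mean, standard_error

# Hypothetical calculated positions (cm.) from six settings of the auxiliary lens.
run = [85.30, 85.36, 85.33, 85.35, 85.32, 85.38]
mean, standard_error = location_from_run(run)
print(mean, standard_error)
```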

This procedure, though tolerably accurate, was tedious and fatiguing.
The following alternative was developed. The auxiliary lens was used
to produce a real image of source and reconstructed images, and also of
the object, left in the path between the source and hologram. Normally
either object or hologram is inverted to avoid overlapping. If, now, the
object is moved back and forth, it may be brought into coincidence with
one (but only one) of the reconstructed images, as judged by the fact
that object and reconstructed image are equally sharp as seen in the
eyepiece. A photograph of this composite field is shown in fig. 10,
Pl. II.

In this way it was only necessary to carry out one location run, at
the beginning of the series, and get all other positions by observing the
shifts of arbitrary indices fixed to the object and hologram carriages.
At the end of the operation, a new location run was performed as a
check.

Fig. 9.—Some suggested systems, showing object, hologram and
reconstructed image. (a) Direct divergence. (b) Indirect divergence. (c) Parallel
light. (d) Convergent light. (e) Gabor’s original reconstructing system.
(f) Reconstruction in parallel light. (g) Lensless reconstruction in divergent
light. (h) Set-up for hologram of a hologram (H 157/0)

Furthermore, since the object was a scale, and the hologram was
produced from it and gives a scale on reconstruction, the effective over-all
magnification of the operation can be determined at each stage by the
direct comparison of the two scales, real and reconstructed, a process
which gives valuable additional information as to the nature of the
operation.

P.R.S.E.—VOL. LXIII, A, 1950-51, PART III.

In this operation a slight difficulty arises in locating object and
hologram from the fact that they are mounted between J-in. glass plates,
and hence their optical position differs from their physical position.
The method chosen is to locate them using the auxiliary lens, and to
calculate their distance from a known focal point of this lens, using
Newton’s formula, on the assumption that the medium is air. This
automatically gives their effective optical position. There is, however,
some slight suggestion that the effective optical position varies slightly
with the angle of the cone accepted by the lens.

Table I gives the index readings and located positions of source,
scale and hologram at the beginning and end of the run.

Table I

             Source          Scale               Hologram           Zero Error
             Located     Index   Located     Index   Located     Scale   Hologram

Beginning    95.75 cm.   94.0    94.10       87.88   85.34       0.10    −2.54
End          95.64       91.0    91.07       79.85   77.08       0.07    −2.77

It is not altogether clear whether the change is due to errors in
location, or whether they arise from a slight drift due to change in the
effective optical position. Calculations have been made on both bases,
those on the second assumption being given in Table II.
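
The zero errors are simply the located positions less the index readings, and the change in question can be made explicit as below. The readings entered are the Table I values as interpreted for this sketch (an assumption wherever the printed digits are unclear), and the names are hypothetical.

```python
# (index reading, located position) in cm., as interpreted from Table I.
readings = {
    "beginning": {"scale": (94.0, 94.10), "hologram": (87.88, 85.34)},
    "end": {"scale": (91.0, 91.07), "hologram": (79.85, 77.08)},
}

def zero_error(index, located):
    """Zero error of a holder: located optical position minus index reading."""
    return round(located - index, 2)

errors = {
    stage: {unit: zero_error(*pair) for unit, pair in units.items()}
    for stage, units in readings.items()
}
drift = {
    unit: round(errors["end"][unit] - errors["beginning"][unit], 2)
    for unit in ("scale", "hologram")
}
print(errors)
print("drift over the run:", drift)
```

On these figures the hologram zero error changes by about 2 mm. over the run, which is of the same order as the setting errors discussed in the text.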

Positions are estimated to 1/10 mm., even when location is less
accurate, to ensure that errors arising from the use of rounded figures in
calculation shall be smaller than those due to the observations. It
will be seen that both the focal length (or the power 1/f) and the
expression Mu/(u − v) are sensibly constant. Experience shows that errors
of 2 mm. in setting the hologram can easily occur, and hence an accuracy
better than ±5 per cent. cannot be guaranteed.

The constancy of the power was expected on theoretical grounds.
The significance of the expression Mu/(u − v) now requires exploration.
Gabor has shown that, in the production of the hologram, the coarse
structure, which is reproduced directly like a shadow, is magnified by
the usual projective law. A similar law is expected to hold for the
reconstruction of the image. Since the coarse structure must act as a
species of “framework”, it is legitimate to expect that the fine structure,
when ultimately reconstituted, will also be found to obey the same law.
This has been verified by experiment.





The over-all magnification M of the whole process, as observed by
direct comparison of the two scales, is recorded in the 11th column.
Now the projective magnification of the 2nd or reconstructive stage will
be less than unity, and is given by (u − v)/u in our notation. The
reconstructed scale is smaller than the coarse framework recorded on
the hologram. Since we know the observed over-all magnification, and
also the theoretical stage 2 magnification (column 13), it is legitimate to
divide the first by the second to get the effective magnification at the
first stage. That is, the magnification between the original object and
the coarse framework on the hologram. Since this is fixed once and for
all by the initial exposure conditions, we anticipate that Mu/(u − v),
its measure, will also be constant, and the constancy of this expression
constitutes the verification of the theory.

It is also found on reference to the conditions under which H 138
was taken, that the source was at 95.84, the scale at 93.04, and the plate
at 80.00. From this we deduce a projective magnification of

$$\frac{15.84}{2.79_5} = 5.65$$

in excellent agreement with the value of Table II (last
column). In fact, allowing for the probable errors of location, a variation
of 5 per cent. would not have been excessive.
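
The two magnifications can be written out as a short calculation. In the sketch below the positions for H 138 are those quoted above, read to two decimals (an assumption where the print is unclear), and u and v are used as in the text's expression (u − v)/u; the function names and the trial figures in the last line are hypothetical.

```python
def projective_magnification(source, obj, plate):
    """First-stage (shadow) magnification: source-to-plate over source-to-object."""
    return (source - plate) / (source - obj)

def first_stage_from_overall(overall, u, v):
    """Divide the observed over-all magnification M by the stage 2 value
    (u - v)/u, giving M * u / (u - v)."""
    return overall * u / (u - v)

m1 = projective_magnification(95.84, 93.04, 80.00)
print("projective magnification for H 138:", m1)

# Purely illustrative figures for the second expression.
print(first_stage_from_overall(2.0, 10.0, 5.0))
```

The first result comes out a little under 5.7, within the 5 per cent. latitude the text allows for errors of location.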

Now we note that H 138 was taken in blue light, but the
reconstructions of Table II were made in green light. We see from this that the
coarse structure is magnified in a purely projective way and the resultant
magnification can be calculated on this basis from source, object,
hologram, and reconstructed image positions without direct reference to the
wavelengths involved. Of course, the wavelength used has an indirect
influence on the position of the reconstructed image, by its influence on
the effective hologram focal length. Moreover, we are here concerned
with the linear magnification: we are not concerned with the resolving
power and some of the magnification may be “empty”. The resolving
power is very closely related to the wavelength.

The data for the conditions under which H 138 was taken lead to a
calculated hologram power of 1.36 dioptres in blue light. This, of
course, varies with wavelength, and so in green light we expect a power of

$$1.36 \times \frac{5461}{4358} = 1.704 \text{ dioptres.}$$

The errors in the positions of source
and scale as located prior to taking H 138 could produce a variation of
±5 per cent. in the power, which is sufficient to bridge the gap between
this and the value in Table II (Column 10).
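
Both figures can be reproduced from the quoted positions. The sketch below assumes the positions for H 138 are in centimetres (source 95.84, scale 93.04, plate 80.00, read to two decimals), uses 1/f = a/[b(a + b)] for the power, and scales it in direct proportion to the wavelength; the function names are hypothetical.

```python
def hologram_power_dioptres(source_cm, object_cm, plate_cm):
    """Power 1/f = a / (b * (a + b)) in dioptres, positions in centimetres."""
    a = source_cm - object_cm      # source to object
    b = object_cm - plate_cm       # object to plate
    return 100.0 * a / (b * (a + b))

def scale_power(power, wavelength_taken, wavelength_used):
    """Power of the same hologram when used in light of another wavelength."""
    return power * wavelength_used / wavelength_taken

blue_power = hologram_power_dioptres(95.84, 93.04, 80.00)
green_power = scale_power(1.36, 4358, 5461)
print("power in blue light:", blue_power)
print("expected power in green light:", green_power)
```

The first value rounds to 1.36 dioptres and the second to 1.704 dioptres, in agreement with the figures in the text.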

§ 8. Wavelength Variations

The above results, while not inconsistent with the theory that the
power varies directly as the wavelength, cannot be regarded as a
particularly convincing demonstration of the fact. To investigate this
phenomenon, the following experiment was performed.

Two holograms were to be taken with the same source, object and
plate positions, but in different wavelengths. Bromide enlargements
were to be made from each negative under fixed conditions of
magnification. These enlargements were then to be cut along a line, and a “top”
of one mounted above a “bottom” of the other. The enlargements
were then photographed down to approximately the original size and a
positive composite hologram was prepared from them.

Two attempts have been made along these lines. The first was an
early attempt with the Osira. Only two wavelengths with this were
sufficiently intense for use in this experiment, the blue and the green.
They do not have a very large ratio (1.25:1) and hence were not very
promising. H 63 was given an hour’s exposure to green light (being an
orthochromatic plate), while H 65 was the same object, in blue light
using a fast ordinary plate for about 10 minutes. The composite
hologram produced from this was called H 68, and was studied intensively in
green light. The upper half of the hologram was from H 65, and the
lower from H 63, the object being a reduced copy of a newspaper cutting
with a chess problem.

The second experiment was done with the H.P. arc. The set-up was
maintained accurately by clamping down the new object holder and
plate holder to the bench. A series of photos was taken in blue and in
red light, thereby getting a more favourable wavelength ratio of ~ 1.42.
Two were selected as being of comparable density and contrast, as
judged from the hologram and step wedge respectively, so as to get
bromide enlargements more closely matched than before. In the end
H 186 and H 187 were chosen.

H 186 was 1 hour’s exposure in red light to a fast panchromatic plate
(P 1500), the object being a microscope eyepiece scale. H 187 was
2 minutes’ exposure on a slow ordinary plate (Ilford Ordinary). After
enlarging, the holograms were trimmed to show the scales and a small
surround, and were mounted with the scales opposite one another, one
being inverted for the purpose. An enlargement to the same degree
was made of the step wedges, which were carefully matched and included
on the hologram to enable the over-all contrast to be checked. On
re-photographing the resultant positive was numbered H 193. In this
case great care was taken to get the final hologram the same size as the
originals, the point being checked on the focusing screen with one of the
original negatives. This is to avoid any complications with the scale
factor.

A run was not performed with H 193, but instead observations were
made in green light rendered parallel by the auxiliary lens. In this
case the focal length can be determined directly as the distance the
eyepiece moves between focusing on the hologram and on the reconstructed
image. A number of settings were made, which showed variations of
± 3 mm. for the shorter focus, and ± 6 mm. for the longer focus. Both
are thus within ± 2½ per cent. The focal lengths were 20.31 cm. and
28.91 cm., corresponding to powers of 4.93 and 3.46 dioptres respectively.
The ratio of these powers is 1.426, in excellent agreement with the ratio
of 1.42 of the wavelengths.

[Table, p. 213: entries not recoverable from the scan.]

A further check is obtained from the taking set-up. The spacing of
source, scale, and plate leads to a power of 3.87 dioptres, this being
in blue for H 187 and in red for H 186. When these powers
are converted from blue and red into green they come out at 4.84 and
3.41 respectively, which gives good agreement with the observed powers.

It seems fair to accept a direct relation between power and
wavelength, as indicated by the theory. This being so, we can go a stage
further, and get a more fundamental constant of the hologram. Hitherto
we have specified the power of a given hologram in a given wavelength,
but since these are proportional to one another, we can now specify the
power per unit wavelength. In order to get numerically convenient
quantities from ordinary holograms it is suggested that the power per
unit wavelength be normally specified in dioptres per micron. We get
for H 187, then, a power per unit wavelength of 8.88 dioptres/micron
and for H 186 of 6.25 dioptres/micron. It will be observed that this
unit has the dimensions $L^{-2}$ and a magnitude of 1/sq. mm. It is called
the power-rate.
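A short numerical sketch of the power-rate may help here. It assumes the blue and green lines are 0.4358 and 0.5461 micron (the wavelengths implied by the ratio 5461/4358 quoted earlier); the function names are illustrative only.

```python
BLUE_UM = 0.4358    # assumed blue wavelength, microns
GREEN_UM = 0.5461   # assumed green wavelength, microns


def power_rate(power_dioptres, wavelength_um):
    """Power per unit wavelength, in dioptres per micron."""
    return power_dioptres / wavelength_um


def power_from_rate(rate, wavelength_um):
    """Power (dioptres) shown by a hologram of given power-rate
    when it is used in light of the given wavelength."""
    return rate * wavelength_um


rate_h187 = power_rate(3.87, BLUE_UM)        # H 187 was taken in blue
green_h187 = power_from_rate(8.88, GREEN_UM)
green_h186 = power_from_rate(6.25, GREEN_UM)
print(round(rate_h187, 2), round(green_h187, 2), round(green_h186, 2))
```

Since 1 dioptre is 1 m⁻¹ and 1 micron is 10⁻⁶ m, one dioptre per micron is 10⁶ m⁻², i.e. 1 per sq. mm., in line with the remark on dimensions.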

For comparison it might be pointed out that for an electron system
with a source-object distance of 0.1 mm. and a source-plate distance of
1 metre, using electrons of 0.05 Å. wavelength, the power-rate will be
about 20 dioptres/micron. Such a hologram will thus be comparable
with the optical ones here considered.
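The electron-optical estimate can be reproduced under two assumptions that are ours rather than the paper's wording: the hologram power is taken as 1/(object-plate distance) − 1/(source-plate distance), and the electron wavelength is read as 0.05 Å. The function name is hypothetical.

```python
ANGSTROM_IN_UM = 1.0e-4   # 1 angstrom = 1e-4 micron


def power_rate_from_geometry(source_object_m, source_plate_m, wavelength_um):
    """Power-rate (dioptres/micron) from the taking geometry.

    Assumes the hologram behaves as the thin lens that images the source
    into the object plane when placed at the plate.
    """
    object_plate_m = source_plate_m - source_object_m
    power = 1.0 / object_plate_m - 1.0 / source_plate_m
    return power / wavelength_um


rate = power_rate_from_geometry(
    source_object_m=0.1e-3,
    source_plate_m=1.0,
    wavelength_um=0.05 * ANGSTROM_IN_UM,
)
print(round(rate, 1))
```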

§ 9. The Scale Factor

It was early appreciated that the scale on which the hologram positive
actually used is reproduced from the negative as originally taken would
be important, and would be worth study as throwing light on the theory
of the process. While, therefore, contact printing was normally used
for reproduction, an enlarger was also employed in some cases.

In the first place a couple of pairs of holograms were prepared, each
containing (a) a contact print, and (b) a reduction in the enlarger. The
latter was preferred to an enlargement, because it gave a hologram
easier to handle. This arises from the fact that such a hologram has
an increased power-rate.

In one case the pair was examined in parallel light, and the focal
lengths determined as for the composite hologram, H 193. We got, for
the normal size, f = 21.47 ± .50 cm. and for the reduced hologram
f = 11.35 ± .34 cm. This leads to a ratio of 1.89 ± .11. Careful
measurements on these holograms give the linear ratio as 1.351 ± .01.
This corresponds to a (ratio)² of 1.8252 ± .03. It will be seen that the
law f ∝ L² holds within the limits of experimental error.

The other pair was measured without an auxiliary lens, by u and v
measurements over a considerable length of bench. The powers obtained
were: Normal 0.55 dioptre, reduced 1.46 dioptre. Ratio 2.67. Linear
reduction 1.573, which squared gives 2.48.

As a further check, one particular hologram was enlarged. This
was H 175, taken in parallel light, with the unusually high power-rate of
17.3 dioptres/micron. This is the only hologram which we felt strong
enough to stand “dilution”. It was subjected to a linear magnification
of 2.70 ± .07 and focal lengths were obtained as follows (in Hg green).
Normal 10.61 ± .2 cm. Enlarged 78.92 ± 1 cm. Ratio 7.44 ± .24.
Square of linear magnification 7.02 ± .35. Here again, the agreement
is satisfactory, and the fact that the linear ratio here departs
substantially from unity is an additional check on the theory.
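The comparisons made in this section amount to checking an observed ratio of focal lengths (or powers) against the square of the measured linear ratio. A minimal sketch for the two contact-print/reduction pairs, using only figures quoted in the text; the helper name is hypothetical.

```python
def ratio_check(observed_ratio, linear_ratio):
    """Return (predicted ratio, observed/predicted) for the law f ∝ L²."""
    predicted = linear_ratio ** 2
    return predicted, observed_ratio / predicted


# Pair examined in parallel light: focal lengths 21.47 cm and 11.35 cm,
# linear ratio 1.351.
pred1, q1 = ratio_check(21.47 / 11.35, 1.351)

# Pair measured by u and v: powers 0.55 and 1.46 dioptre,
# linear reduction 1.573.
pred2, q2 = ratio_check(1.46 / 0.55, 1.573)

print(round(pred1, 3), round(q1, 3))
print(round(pred2, 3), round(q2, 3))
```

A quotient near 1 indicates agreement; whether the departure is acceptable has to be judged against the quoted experimental errors, as the text does.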

§ 10. The Two Images 

It is an essential of the theory that both zone plate and hologram
produce two images. So far we have only examined one image at a
time. The experiments first described located the image produced by
H 140 acting as a divergent lens. Some of the later experiments,
determining the focal length directly, use the hologram as a converging lens. It
remains to show that the converging and diverging powers are the same.

Reference may here be made to some work with the early composite
hologram H 68 in green light. The work was all done with the old
Osira and was thus not carried out under the most favourable conditions.
The method of locating an image with a moveable object had not been
developed (and would not have located the second image). So all data
were obtained from “Location runs”. This makes the analysis very
tedious, and the work has not been repeated.

Measurements were made to locate (i) the upper image sharp, (ii) the
lower image sharp, (iii) the source, (iv) the “symmetrical” lower image
sharp, and (v) the “symmetrical” upper image sharp. The hologram
itself was located separately.

It at once became apparent that the “symmetrical” images were not
symmetrical. Gabor’s deduction of symmetrical images rests on the
assumption that the image-source distance is very small compared with
the source-hologram distance, i.e., that f is very large. In this case,
indeed, the lens formula does give symmetrical images.

The results are summarized in Table III.





An examination of Table III shows at once that the results obtained
from the upper part of the hologram (U₁ and U₂) are as satisfactory as
can be desired. This part of the hologram contained resolvable writing,
and was processed to very nearly the ideal contrast. Hence this image
was easy to locate. The lower part, L₁ and L₂, does not give the same
sort of agreement. Here the hologram contains no resolvable writing,
and is of lower contrast, and hence the locations were more difficult. It
is not felt that the divergence is beyond the limits of experimental
error.

Working from the estimated power on taking, and assuming the
zone-plate law, we get theoretical powers of 1.67 and 2.01 dioptres for the two
cases. But owing to the practical difficulties of being sure of the correct
positions before the taking device was made, these figures must be taken
as indicating an order of magnitude only.

We see, therefore, that the theoretical expressions derived from the 
simple theory of the zone plate are found experimentally to hold for 
holograms in general, within the limits of experimental error. We 
therefore feel justified in concluding that the hologram is a generalized 
zone plate and in particular we associate it with the following properties: 

(1) A given Fresnel diffraction pattern, produced in coherent
monochromatic light of wavelength λ proceeding through a point, may be
associated with a focal length f, being the focal length of that lens which,
placed in the plane of the pattern, images the source (or virtual source)
in the plane of the object producing the pattern.

(2) If a photographic reproduction of the pattern is now placed in a
coherent beam of light of wavelength λ it will form two “images” of the
source, resembling the object, as though it were a lens of focus ± f.

(3) If it be placed in a beam of wavelength λ′ it will act as if it had
a new focal length ± f′ where f′λ′ = fλ = a constant of the pattern.
In particular 1/fλ is called the power-rate.

(4) If the scale of the pattern be altered by a linear factor L, it will
be found that the constant factor fλ is altered by a factor L².

(5) The effective magnification between the original object and its
reconstruction can be calculated by purely projective considerations,
from the original source through the object to the hologram, and then
from the hologram through the reconstruction to the reproducing
source.
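Properties (2) to (4) can be gathered into a small model, sketched below. The class and method names are hypothetical; the only physics used is that the power-rate 1/fλ is a constant of the pattern and that fλ scales as L². The example value of 17.3 dioptres/micron is the power-rate quoted for H 175, and 0.5461 micron is assumed for mercury green.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ZonePlateHologram:
    """A hologram described by its power-rate 1/(f*lambda), dioptres/micron."""
    power_rate: float

    def focal_lengths_cm(self, wavelength_um):
        """Property (2)/(3): the two foci +f and -f in the given wavelength."""
        f_cm = 100.0 / (self.power_rate * wavelength_um)
        return (f_cm, -f_cm)

    def rescaled(self, linear_factor):
        """Property (4): f*lambda scales as L**2, so the power-rate as 1/L**2."""
        return ZonePlateHologram(self.power_rate / linear_factor ** 2)


h175 = ZonePlateHologram(power_rate=17.3)
f_green, _ = h175.focal_lengths_cm(0.5461)
f_doubled, _ = h175.rescaled(2.0).focal_lengths_cm(0.5461)
print(round(f_green, 2), round(f_doubled, 2))
```

Property (5), the projective magnification, is deliberately left out of the class: as the text goes on to say, it is a second, independent parameter of the hologram.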

The projective law follows from the “coarse structure” relation, and
means that the hologram of an object (as against a point or line) must
be described by two parameters. One is f: this gives the size of the
fringe structure round any particular point. The second is the projective
magnification: this determines the separation between the centres of
these fringe systems in terms of the separation of the two original points.





This also explains why “high order” images are not normally observed.
If we reduce f to f/3 we will normally reduce the fine structure in one
ratio and the coarse in another: one varies with f and the other with the
projective magnification.

§ 11. Multiple Operations 

When it has been proved that an operation can be performed once,
it is frequently instructive to perform it twice. Accordingly H 140
was taken and put about a metre (103 cm.) from the source in green
light. Under these conditions it would produce two images, a real
image about 123 cm. further away in a direction opposite to the source,
and a virtual image 36 cm. on the source side of the hologram. A plate
was put up 45 cm. on the more distant side of the hologram, so that the
light passed through the hologram and fell on the plate before forming
the real image. The plate thus records, in effect, two images, one being
81 cm. one side of it, and the other (unformed) 78 cm. on the other side
of it.

The hologram, H 169, formed from this negative has two effective focal
lengths, ± f₁ and ± f₂, and produces four images from a point source.
It is shown in Fig. 15, Pl. II.

Consideration of the taking arrangement suggests that these two foci
should be ± 51.2 cm. and ± 180.3 cm., but that neither is likely to be
estimated to better than ± 6 per cent., if that. Direct measurement of
the foci gives 47.1 and 168.6 cm. respectively, again to ± 6 per cent. or
a little better. In view of the fact that none of these images is as easy
to see as is the case with a simple hologram, this agreement may be
regarded as satisfactory.
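The anticipated foci of the second-order hologram can be followed through with the thin-lens formula. The sketch below is our own working, not the author's: it takes H 140 at 103 cm from the source with its real image 123 cm beyond, reads the plate distance as 45 cm beyond the hologram (the value consistent with the quoted 81 cm and 78 cm), and assigns to each recorded image the focal length of the lens that would image the source onto it from the plate plane.

```python
def thin_lens_image(u_cm, f_cm):
    """Image distance v from 1/u + 1/v = 1/f.

    v > 0: real image beyond the lens; v < 0: virtual image on the
    source side.
    """
    return 1.0 / (1.0 / f_cm - 1.0 / u_cm)


source_to_h140 = 103.0
real_image = 123.0          # beyond H 140
plate_offset = 45.0         # plate position beyond H 140 (assumed reading)

# Focal length of H 140 from its real image, then its virtual image.
f_h140 = 1.0 / (1.0 / source_to_h140 + 1.0 / real_image)
virtual_image = thin_lens_image(source_to_h140, -f_h140)   # negative

source_to_plate = source_to_h140 + plate_offset
dist_virtual = plate_offset - virtual_image   # image on the source side
dist_real = real_image - plate_offset         # unformed image beyond plate

# Lens in the plate plane imaging the source onto each recorded image.
f_long = 1.0 / (1.0 / dist_virtual - 1.0 / source_to_plate)
f_short = 1.0 / (1.0 / dist_real + 1.0 / source_to_plate)
print(round(dist_virtual, 1), round(dist_real, 1))
print(round(f_short, 1), round(f_long, 1))
```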

Now one method of reconstruction is virtually this: the hologram
is put, say, 100 cm. from the source and forms a real image at a greater
distance, say 223 cm., as above. If a plate be placed here, it will record
the image and this constitutes a useful and legitimate method of
reconstruction. But this reconstruction is itself a hologram of a hologram
and should have two focal lengths. One will be zero: and as + 0 is
the same as − 0 we have here a case of coincident roots. The hologram
and image become the same thing. But there will also be a subsidiary
pattern from the virtual image between the source and the original
hologram. Gabor recognizes this when he says that the unwanted image
interferes to a slight extent with the wanted image. But the unwanted
image does more than that; it records on the reconstruction, albeit
diffusely, all the information required to reconstruct it.

We took one of our reconstruction negatives (R 7) (fig. 12, Pl. I) and
printed it to form a hologram H 162. This hologram was examined in
parallel light, and as well as having the trivial focal length ± 0 (giving a
sharp image in its own plane) it was found to have a focal length of
141.0 cm. within the usual limits. The anticipated focal length from
the details of R 7’s taking was 145.5 cm., in satisfactory agreement.

We also made a reconstruction (R 11) (fig. 14, Pl. I) from this
hologram to provide visual proof of the existence of the secondary image
associated with the original reconstruction R 7.

Of course, this can in theory be done indefinitely. R 11 is a
third-order hologram, and in theory might give rise to eight images. In
practice, owing to the existence of several coincident roots, there will be
five, one of which coincides with its own plane. No other third-order
holograms have yet been produced, but it might be instructive to try
and produce one from H 150 in such a way as to produce coincident
roots in a place other than in its plane. This might give an extra clean
image, or even give rise to “beating” effects in the plane of the
third-order hologram.*


§ 12. Miscellaneous Experiments

It is inevitable in an investigation of this kind that a number of side 
experiments are set up to explore minor ramifications of the process. 
A number of these are worth putting on record. 

(a) An early experiment was made with two objects so sandwiched
in glass as to lie in different planes, and thus at different distances from
source and hologram. The hologram so produced reconstructs the two
objects in different planes, with such overlapping of the out-of-focus
outlines as would occur in direct viewing through an equivalent system.
The phenomenon can be explained by noting that each contributes to
the net effect a diffraction pattern with a characteristic (but different)
focal length, and the positions of the reconstructed images can be
obtained from this.

(b) A similar effect can be obtained by taking two holograms of
different objects with different power-rates. These are then
double-printed on to a single plate, i.e., a given plate is exposed an appropriate
time behind each negative in turn and then developed. The diffraction
pattern which results differs in detail from that of case (a) but the visual
effects which result are very similar. Once again the two images can
be partially separated by their different focal lengths.

(c) In one case a copy negative was made from a positive (with an
over-all γ from the original pattern of ~2) and used as a hologram.
This has the same focal length as the positive, and produces a
reconstructed image which is a negative. R 7 and R 8 (figs. 12 and 13, Pl. I)
are reconstructions from a positive and a negative hologram (positive
hologram shown as fig. 11, Pl. I) under identical conditions, and they are
mutually super-posable to give a sensibly uniform density. The exact
theory behind this is not known, as it was not expected from Gabor’s
analysis, but the exactitude with which it works is impressive. If the
initial negative can be developed to a γ of 2 (and this will only be possible
with certain emulsions) there is a possibility here of economy in running
costs. For a plate or paper set up in the reconstruction plane now gives
a direct positive. As the plates used here are not all suitable for high γ
working (and this has other disadvantages) the method has not been
much used, but some of the electron-sensitive emulsions will be more
amenable to this type of treatment.

* Note added 8.9.51. This has since been identified as a reconstruction of
the first order hologram.

(d) It follows from (b) and (c) that holograms can be subtracted.
In particular, it should be possible to subtract a hologram of the
background from that of background + wanted image, to obtain the latter
alone. The successful accomplishment of this would require a carefully
adjusted “dummy” object (plain glass of equivalent thickness) and hence
it has not been attempted.

(e) A relief image in plain (transparent) gelatine has been prepared
by the Carbro process from a hologram (H 135) of the scale. By itself,
this relief image introduces excessive phase distortions into a coherent
beam. But if a covering plate is mounted over it using D.P.X. cement,
there remains just sufficient difference of refractive index between the
gelatine and the D.P.X. to produce a delicately graded phase-contrast
object. This phase-contrast hologram reconstructs as before. The
analogy is with a zone-plate in bichromated gelatine with a ½λ retardation
in the gelatine areas. As is well known, this zone-plate acts like a normal
one, but with greater light-gathering power.

(f) A bromide enlargement was made from a hologram negative, and
the dark fringes were inked over with Indian ink. The bromide image
was then removed with Farmer’s reducer to leave the ink drawing on an
otherwise white ground. This was photographed down to form a
hologram (called the black-and-white hologram). The subject was the
microscope eyepiece scale, and the black-and-white hologram, though
only containing three or four fringes, gave a crude but recognizable
reconstruction. The method is chiefly advantageous in allowing
background dirt to be ignored. A fuller description, with photographs, is
given in Rogers (1950).

(g) An attempt was made to produce a composite hologram
containing two side-by-side images in dye, produced by dye-coupling
developers. The idea was to use the whole radiation to illuminate the
system and get one side of the hologram absorbing only one wavelength,
the other only another, and thus effecting reconstructions in two different
wavelengths in adjacent positions. But the dyes used did not interact
in the way hoped, and tended to absorb too wide a wavelength band.





(h) An attempt was made to produce a hologram in a beam of parallel
light and this was successful. The only conditions are that the object
shall be small compared with the diameter of the beam of parallel light,
and that the hologram shall be recorded reasonably near to the object,
so that the diffraction pattern of the object still falls wholly within the
area of plate illuminated by the parallel beam. The object-plate
distance is automatically the focal length of the hologram, and this is
how the short-focus (high power) hologram mentioned in § 9 was
obtained.

(i) It is worth placing on record the fact that we have, in effect,
produced a hologram in convergent light. About four years ago, while
doing some experiments with the honours class, a number of Fresnel
diffraction patterns were produced in convergent light. The light from
a small source (Hg arc with pin-hole) was made slightly convergent
with a telescope objective. It was found in this way that (i) a slightly
larger pattern could be condensed on to a slightly smaller photographic
plate, and that (ii) a concentration of light was produced which gave a
welcome reduction of exposure.*

§ 13. Methods of Reconstruction 

It is convenient here to summarize the methods of producing a
reconstruction from a hologram. (i) There is the original Gabor method
of projecting the small reconstructed image near the original object
position on to a plate by the use of an auxiliary lens. (ii) The light
from the point source may be rendered parallel by an auxiliary lens, and
the image observed or photographed in the focal plane of the hologram.
(iii) The auxiliary lens may be dispensed with, and the divergent beam
passed through the hologram, which must be further from the source
than its own focal length, and the resultant real image observed or
photographed.

Methods (ii) and (iii) are recommended, as avoiding very fine-grain 
techniques in the photography. Method (ii) conserves the light well over 
long distances, and hence exposures are kept low. On the other hand 
the last method allows further projective magnification to be obtained, 
at some cost in exposure time, if desired. 

We understand that Gabor (Gabor, 1950) has discovered method (ii) 
independently. 

§ 14. Artificial Holograms

Some work has been done on artificial (calculated) holograms,
including linear zone-plates, but a discussion of these experiments is reserved
for a further paper.

* Note added 8.9.51. This has since been confirmed by reconstruction.



Proc. Roy. Soc. Edin., A, Vol. LXIII

Fig. 12. R 7, a reconstruction from a positive of H 140

Fig. 14. R 11, a reconstruction from a positive of R 7
(a third order hologram)

G. L. Rogers] [Plate I

Fig. 10. Composite image consisting of a reconstruction (above) alongside an
image of the object (below), located in the plane of reconstruction, as used for
the work of para. 7, using a Gabor system (fig. 1e)

Fig. 15. H 169, a hologram of a hologram

G. L. Rogers] [Plate II









§ 15. Acknowledgments 

My thanks are due to Professor G. D. Preston for his current help and
encouragement; to Professor Gabor for coming to Dundee and giving
a lecture which sowed the seeds of many of these ideas, including the
zone-plate analogy in its simplest form, and for periodic encouragement
since; to Messrs. Pilkington Bros. for a supply of their best J-in. plate
glass, without which the work would have been prohibitively expensive;
to Professor A. C. Lendrum and the D.R.I. for supplies of D.P.X.; to
Mr. A. Gunn for advice on the Carbro process; to Dr. C. Waller, of
Ilford Ltd., for preparing me a batch of specially sensitized Ilford H.R.
plates; to the workshop staff and especially J. and W. Stark for their
patience with my mechanically exacting demands.


REFERENCES TO LITERATURE

Airy, 1877. Undulatory Theory of Optics, Macmillan, 1877, pp. 60-68.

Bragg, 1950. Nature, clxvi, 399.

Gabor, 1948. Ibid., clxi, 777.

-, 1949. Proc. Roy. Soc. (A), cxcvii, 454.

-, 1950. Private Communication.

Jenkins and White, 1937. Fundamentals of Physical Optics, McGraw Hill,
pp. 184-201.

Rogers, 1948. Journ. Sci. Instru., xxv, 284.

-, 1950. Nature, clxvi, 1027.

Strong, 1938. Modern Physical Laboratory Practice, Blackie, p. 69, footnote.


(Issued separately February 9, 1952)


222


A. J. and S. S. Macintyre, Theorems on the


XV.—Theorems on the Convergence and Asymptotic Validity 
of Abel's Series.* By A. J. Macintyre and Sheila Scott 
Macintyre, University of Aberdeen 

(MS. received March 3, 1951. Read November 5, 1951)

Synopsis 

In this paper we discuss the Abel series for a function F(z) which is regular in
an angle |arg z| < α and at the origin. We investigate conditions under which
the series converges and conditions under which its sum is asymptotically equivalent
to the function F(z) in the half-plane R(z) > 0.

1. Introduction 

Associated with a function F(x) defined and infinitely differentiable
for x ≥ 0 is its Abel series (Abel, 1830)

$$\sum_{n=0}^{\infty} z(z-n)^{n-1} F^{(n)}(n)/n! \qquad (1)$$

If this series converges for a single value of z ≠ 0, it converges uniformly
for each bounded region of the z-plane and its sum is an integral function
of exponential type (Halphen, 1881). Abel was possibly not aware of
this property as he applied the series without comment in a note which
was published posthumously (Abel, 1830, page 82) to the function
log (1 + z). Halphen investigated the series for log (1 + z) in
considerable detail (Halphen, 1881) with the aim of showing that it could not
represent log (1 + z). Much subsequent work has been devoted to the
convergence and summability of the Abel series to the function F(z)
when F(z) is an integral function satisfying certain conditions
(Gontcharoff, 1935; Gelfond, 1938; Buck, 1948).
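As a concrete illustration of series (1), the sketch below evaluates partial sums for the integral function F(z) = e^{az}, for which F^{(n)}(n) = (a e^{a})^n. The choice a = 0.2 and z = 1.5 is arbitrary and ours; it is small enough that the series is expected to converge to F(z) itself. The n = 0 term is taken as F(0). Function and parameter names are hypothetical.

```python
import math


def abel_partial_sum(f0, deriv_at_n, z, terms):
    """Partial sum of the Abel series
        F(0) + sum_{n=1}^{terms} z (z - n)^(n-1) F^(n)(n) / n!
    where deriv_at_n(n) returns the n-th derivative of F evaluated at n.
    """
    total = f0
    for n in range(1, terms + 1):
        total += z * (z - n) ** (n - 1) * deriv_at_n(n) / math.factorial(n)
    return total


a = 0.2          # illustrative exponential type
z = 1.5          # illustrative evaluation point
approx = abel_partial_sum(
    f0=1.0,
    deriv_at_n=lambda n: (a * math.exp(a)) ** n,   # n-th derivative of e^{az} at z = n
    z=z,
    terms=60,
)
exact = math.exp(a * z)
print(approx, exact)
```

The terms alternate in sign and shrink roughly geometrically for this choice of a, so sixty terms are ample in double precision.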

The present paper supposes F(z) to be regular in an angle |arg z| ≤ α
and investigates conditions under which its Abel series converges. The
sum of the series is not generally equal to F(z), but we show that under
slightly more stringent conditions it has an asymptotic validity in the
sense that the difference between F(z) and the series tends to zero as
R(z) tends to infinity by positive values. Our principal results are

Theorem 1: If for $|\arg z| \le \tfrac{3}{4}\pi$, F(z) is regular and satisfies

$$|F(re^{i\theta})| < K r^{-\epsilon} e^{r b(\theta)} \qquad (2)$$

for some positive K and ε then the Abel series for F(z) converges uniformly
for z in any bounded domain.

and

Theorem 3: If for $|\arg z| \le \tfrac{3}{4}\pi$ and $R(z) \ge -h$ (h > 0), F(z) is regular
and satisfies

$$|F(re^{i\theta})| < K r^{-\gamma} e^{r b(\theta)} \qquad (3)$$

for some K > 0 and γ > 1 then representing the sum of its Abel series by
A(z)

$$|F(z) - A(z)| = o(1) \qquad (4)$$

as x = R(z) tends to infinity by positive values.

* This paper was assisted in publication by a grant from the Carnegie Trust
for the Universities of Scotland.

In the statement of these theorems, b(θ) is a certain continuous
positive function of θ with period 2π. If F(z) is supposed to be an
integral function and to satisfy (3) for all θ then it is a known theorem
(Schmidli, 1942) that the series converges to F(z). The proof of Theorem
1 is fairly direct, but before proceeding to the proof of Theorem 3 we
have to develop some aspects of the theory of the Laplace transform of
functions regular in an angle.

Somewhat more general theorems (Theorems 2 and 4) are also proved.
The conclusions of Theorems 1 and 3 continue to hold if certain more
drastic inequalities are assumed in a smaller angle.

2. Convergence Conditions

It is known (Pólya and Szegő, 1945, III, §§ 116, 265) that

$$\left|1 - \frac{z}{n}\right|^{n} \le e^{r b(\theta)} \qquad (5)$$

for all $z = re^{i\theta}$ where b(θ) is the supporting function of the bounded
domain B which is represented by

$$Z = w e^{1+w} \qquad (6)$$

on |Z| < 1. B is a convex pear-shaped domain containing the origin
whose boundary $C_w$ has a vertex at w = −1 where the two tangents
make angles ± ¼π with the positive direction of the real axis. If we
put z = nζ and $w = (\zeta - 1)^{-1}$ then $C_w$ is represented in the ζ-plane by
a curve $C_\zeta$ and in the z-plane by a curve $C_z$ whose shape is independent
of n. Equality is attained in (5) when z lies on $C_z$. $C_z$ has a cusp at
the origin and its tangents there are the radii arg z = ± ¾π. $C_z$ lies
entirely within the sector |arg z| ≤ ¾π except for z = 0 and is
continuous and rectifiable.





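If we now use Cauchy’s formula for $F^{(n)}(n)$, taking the contour as $C_z$,
the proof of Theorem 1 is immediate. For

$$\frac{F^{(n)}(n)}{n!} = \frac{1}{2\pi i} \int_{C_z} \frac{F(z)\,dz}{(z-n)^{n+1}} \qquad (7)$$

and from (2) it follows that (with r = |z|)

$$\left|\frac{F^{(n)}(n)}{n!}\right| \le \frac{K}{2\pi} \int_{C_z} \frac{r^{-\epsilon} e^{r b(\theta)}\,|dz|}{|z-n|^{n+1}} < K' n^{-n-\epsilon} \qquad (8)$$

if we take (as we evidently may, without loss of generality) ε < 1, to
ensure convergence near z = 0. From this, the modulus of the nth
term of Abel’s series satisfies

$$\left|\frac{z(z-n)^{n-1}}{n!} F^{(n)}(n)\right| < K' |z| \left|1 - \frac{z}{n}\right|^{n-1} n^{-1-\epsilon} \qquad (9)$$

and the right-hand side is $O(n^{-1-\epsilon})$ uniformly in any bounded domain.
This proves Theorem 1.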

We obtain a generalization of Theorem 1 by taking the contour $C_z^{\alpha}$
consisting of the radii arg z = ± α (0 < α < ¾π) from the origin to the
points where they meet $C_z$ and the arc of $C_z$ for which |arg z| ≤ α, in
place of $C_z$ in (7). Let ½π < α < ¾π; then for arg z = α

$$|z|^{-1} \log \left|1 - \frac{z}{n}\right|^{n}$$

increases steadily with |z| from its value − cos α for z = 0 to its maximum
on $C_z$ (Pólya and Szegő, 1945, III, § 265). Consequently for z on the
radial sections of $C_z^{\alpha}$ we have

$$\left|1 - \frac{z}{n}\right|^{n} \ge e^{-|z| \cos\alpha} \qquad (10)$$

In order to obtain the inequality (8) from (7) after the contour $C_z$ is
replaced by $C_z^{\alpha}$ we must now suppose that F(z) satisfies the inequalities

$$|F(re^{i\theta})| < K r^{-\epsilon} e^{r b(\theta)}, \quad |\theta| < \alpha; \qquad |F(re^{\pm i\alpha})| < K r^{-\epsilon} e^{-r\cos\alpha} \qquad (11)$$

Thus we have

Theorem 2: If F(z) is regular for |arg z| ≤ α (½π < α < ¾π) and
satisfies conditions (11) then its Abel series converges uniformly in any
bounded domain.

We may note that on account of the Phragmén-Lindelöf (1906)
inequalities, Theorem 2 implies a more stringent inequality than the
first of (11) for some range of θ near α. In fact, we could say that a
(truncated) indicator diagram for F(z) is at most B for Theorem 1 and
the part of B within |arg (z + 1)| ≤ α − ½π for Theorem 2. The
argument for Theorem 2 continues to apply when ¼π ≤ α ≤ ½π, but the
hypothesis then degenerates to $|F(re^{i\theta})| < K r^{-\epsilon} e^{-r\cos\theta}$ for |θ| ≤ α
because of the Phragmén-Lindelöf inequalities. When α < ¼π, as
$|1 - z/n|^{n+1}$ has an exponentially small minimum along arg z = ± α
(on a curve which bears the same relationship to the continuation of
$C_w$ as $C_z$ does to $C_w$), we would have to impose a more stringent inequality
still.


3. Somh Remarks on thb Laplace Transform 

In this seotion we follow the notation of Polya (1029), and make use 
of extensions of some of his results to functions which are regular and 
of exponential type in an angle (Pfliiger, 1036-6; Maointyre, 1030; 
Rabinovio, 1948). 

Let $K$ be a closed convex region containing the origin and

$$h(\phi) = \max_{x + iy \in K} (x \cos\phi + y \sin\phi)$$

is the supporting function of $K$ and $x \cos\phi + y \sin\phi = h(\phi)$ is the
supporting line $T_{\phi}$ of normal direction $\phi$ of $K$. $K$ is completely
determined by $h(\phi)$. Each boundary point of $K$ has at least one supporting
line through it and each supporting line has at least one extreme point in
common with $K$, that is a boundary point which is not an interior point of a
segment of a straight line forming part of the boundary of $K$. Let us denote
the set of boundary points of $K$ by $C$. By $C(\alpha)$ we denote the subset of
boundary points through which pass supporting lines of normal direction $\phi$
with $-\alpha < \phi < \alpha$. If $\tfrac12\pi < \alpha < \pi$, $T_{\alpha}$ and $T_{-\alpha}$ will intersect the
negative real axis. We denote the points of intersection by $D_{\alpha}$ and $D_{-\alpha}$
respectively. They will intersect each other in a point $D$, say, corresponding
to the complex number $z_0 = x_0 + iy_0$. Let the boundary points of $K$ which lie
on $T_{\alpha}$ and $T_{-\alpha}$ nearest to $D$ be denoted $E_{\alpha}$ and $E_{-\alpha}$. Then the join of $K$
and the triangle $D E_{\alpha} E_{-\alpha}$ is also a convex set. When $\tfrac12\pi < \alpha < \pi$, its
supporting function is equal to $h(\phi)$ when $-\alpha \le \phi \le \alpha$ and equal to
$x_0 \cos\phi + y_0 \sin\phi$ when $\alpha < \phi < 2\pi - \alpha$. We denote this convex set
by $K(\alpha)$ and call $D$ the vertex of $K(\alpha)$.

Now suppose that $F(z)$ is regular for $|\arg z| < \alpha$ (continuous on the
boundary) and satisfies

$$|F(z)| < K e^{A|z|} \qquad (13)$$

The Laplace transform of $F(z)$ defined by

$$f(z) = \int_0^{\infty} F(t) e^{-zt}\, dt \qquad (14)$$



A. J. and S. S. Macintyre, Theorems on the


is regular for $R(z) > A$, and by rotating the contour in (14) from the
positive real axis to the radius $\arg t = \theta$ $(-\alpha < \theta < \alpha)$, $f(z)$ is
analytically continued into the half plane $R(ze^{i\theta}) > A$ and so into a Riemann
surface whose boundary consists of the circular arc $|z| = A$ $(-\alpha \le \arg z \le \alpha)$,
and the two half tangents at the extremities of this arc.
When $\tfrac12\pi < \alpha < \pi$ the sector $|\arg (A \sec\alpha - z)| < \alpha - \tfrac12\pi$ appears twice
in this surface. It is clear that as $|z| \to \infty$ in the region of regularity
of $f(z)$, we have

$$|f(z)| = O\!\left( \frac{1}{|z|} \right) \qquad (15)$$

uniformly for $|\arg z| < \alpha + \tfrac12\pi - \delta$. The inverse transform to (14)
is

$$F(z) = \frac{1}{2\pi i} \int_{\Gamma} e^{z\zeta} f(\zeta)\, d\zeta \qquad (16)$$

where $\Gamma$ is a contour consisting, to take a particular form, of a circular
arc $|\zeta| = A + \delta$, $|\arg\zeta| \le \tfrac12\pi + \delta$, together with the half tangents at
the extremities extending to infinity so that the real part of $\zeta$ tends to
minus infinity (Macintyre, 1939).

With our hypothesis that $F(z)$ is regular for $|\arg z| < \alpha$ where
$\tfrac12\pi < \alpha < \tfrac34\pi$, (15) is satisfied and the contour can be replaced by that
consisting of the negative real axis taken twice in opposite directions
and the complete circumference of the circle $|\zeta| = A + \delta$ from $-(A + \delta)$
back to itself. The representation (16) will then be valid for $R(z) > 0$.
If now $h(\theta)$ is the supporting function of the convex region $K$ containing
the origin and $F(z)$ satisfies the inequality

$$|F(re^{i\theta})| < K r^{-\gamma} \exp\{r h(\theta)\}, \quad -\alpha < \theta < \alpha, \quad \tfrac12\pi < \alpha < \tfrac34\pi, \quad \gamma > 1 \qquad (17)$$

then the contour $\Gamma$ can be distorted into the negative real axis from
$-\infty$ to $D_{-\alpha}$, the supporting line $T_{-\alpha}$, the boundary $C(\alpha)$, the supporting
line $T_{\alpha}$ and the negative real axis from $D_{\alpha}$ to $-\infty$. The special case
in which $F(z)$ is an integral function was considered by S. Schmidli
(1942), but his proof applies equally well to functions regular in an
angle.

If we also suppose $K$ to be symmetrical with respect to the real axis in
order to obtain a simpler enunciation we have

Lemma 1: If $F(z)$ is regular for $|\arg z| < \alpha$, where $\tfrac12\pi < \alpha < \tfrac34\pi$ and
satisfies (17) where $h(\theta)$ is the supporting function of a convex region $K$
symmetrical with respect to the real axis then for $R(z) > 0$, $F(z)$ has the
representation

$$F(z) = \frac{1}{2\pi i} \int_{\Gamma} e^{z\zeta} f(\zeta)\, d\zeta \qquad (18)$$

where the contour $\Gamma$ consists of the negative real axis taken twice between
$-\infty$ and the vertex of $K(\alpha)$ and the boundary $\Gamma(\alpha)$ of $K(\alpha)$. $f(z)$ is defined
by (14).

The inequality (15) is insufficient for our applications and we have to
supplement it. The integral (18) can be replaced by the following:

«- h r." ^ + » l*™*' ■ • 

where $\phi(\zeta)$ is the difference between the value of $f(\zeta)$ on the negative
axis obtained by analytic continuation from the lower half plane, say
$f_{-}(\zeta)$, and the value obtained by analytic continuation from the upper
half plane, say $f_{+}(\zeta)$. With the notation

$$f(z) = \int^{\theta} F(t) e^{-zt}\, dt \qquad (20)$$

to mean that the radius of integration is $\arg t = \theta$, we can evidently use
for $\phi(z)$ ($z$ real and $< -1$), the representation

^(*)-/-(*)-/+(*)“ |f J{t)e~*dt+r' F(l)e-«dl\ . (21) 

Now if $F(z)$ is regular for $R(z) \ge -h$ $(h > 0)$, the combined contour in
(21) can be distorted near the origin into a segment of the line $R(t) = -h$.
From this new representation we see that as $z$ tends to infinity by negative
values, the contribution of the integrals along the infinite half-radii is
( e -kioo* a \ 

- ~—J while the contribution along the segment of 12(1) = — h 

is $O(e^{hz})$ and hence

$$|\phi(z)| = O(e^{hz}) \qquad (22)$$


Accordingly, we have 

Lemma 2: If $F(z)$ satisfies the hypotheses of Lemma 1 and is regular
for $R(z) > -h$ for some $h > 0$, $F(z)$ has the representation (19) valid for
$R(z) > 0$ and $\phi(z)$ satisfies (22), for $z$ real and negative.

4. The Asymptotic Validity of Abel's Series

If we take $K$ to be identical with the convex region $B$ described in
§ 2, so that $h(\theta) = b(\theta)$, and take $\alpha = \tfrac34\pi$, then $K(\alpha)$ is also identical with
$B$. The hypotheses of Lemmas 1 and 2 now coincide with the hypotheses
of Theorem 3. At this stage a comparison with Schmidli's analysis
seems appropriate. Schmidli (1942) assumes that $F(z)$ is an integral
function and satisfies (3) for all $z$. His conclusion depends on the
representation (19) with the integral along the negative real axis missing.
In his case the expansion of $e^{z\zeta}$ in powers of $\zeta e^{\zeta}$ and integration term by
term leads immediately to his conclusion. We perform the same operation.
$e^{z\zeta}$ is expanded in powers of $\zeta e^{\zeta}$ as the series

$$\sum_{n=0}^{\infty} z (z - n)^{n-1} \zeta^{n} e^{n\zeta} / n! \qquad (23)$$

This series is at the same time the Abel series of $e^{z\zeta}$ regarded as a function
of $z$ and a Lagrange expansion of $e^{z\zeta}$ regarded as a function of $\zeta$. The
resulting series

f «—0 

is by Lemma 2 the Abel series of $F(z)$ and will be denoted $S$.

Now the series (23) converges uniformly to $e^{z\zeta}$ for $\zeta$ on the boundary
of $B$, but when $\zeta$ lies on the segment $\zeta < -1$ of the real axis the series
converges uniformly but not to $e^{z\zeta}$. It now converges to $e^{z\xi}$ where
$\xi e^{\xi} = \zeta e^{\zeta}$ and $-1 < \xi < 0$ (Halphen, 1881).
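The two limits just described can be observed numerically. The following is only a sketch, restricted to real $z > 0$ and real $\zeta$; the function names `abel_series_of_exp` and `small_root` are illustrative and do not come from the paper, and the partial sums are formed with logarithms purely to avoid overflow.

```python
import math


def abel_series_of_exp(z, zeta, terms=200):
    """Partial sum of sum_{n>=0} z (z-n)^(n-1) (zeta e^zeta)^n / n! for real z > 0."""
    w = zeta * math.exp(zeta)
    total = 1.0  # n = 0 term: z * z**(-1) = 1
    for n in range(1, terms):
        base = z - n
        if base == 0.0:
            # the term reduces to z*w when n = 1 and vanishes otherwise
            total += z * w if n == 1 else 0.0
            continue
        if w == 0.0:
            break
        # logarithms keep the large powers and factorials within float range
        log_mag = (math.log(z) + (n - 1) * math.log(abs(base))
                   + n * math.log(abs(w)) - math.lgamma(n + 1))
        sign = 1.0
        if base < 0 and (n - 1) % 2 == 1:
            sign = -sign
        if w < 0 and n % 2 == 1:
            sign = -sign
        total += sign * math.exp(log_mag)
    return total


def small_root(zeta):
    """Solve xi*exp(xi) = zeta*exp(zeta) with -1 < xi < 0 by bisection (zeta < -1)."""
    target = zeta * math.exp(zeta)
    lo, hi = -1.0, 0.0  # xi*exp(xi) increases from -1/e to 0 on this interval
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if mid * math.exp(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For $\zeta = -0.3$ the partial sums approach $e^{z\zeta}$, while for $\zeta = -2$ they approach $e^{z\xi}$ with $\xi$ the root returned by `small_root(-2.0)`, and not $e^{z\zeta}$.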

We thus have, since $\phi(\zeta)$ satisfies (22),

*-b/I! 'snnu+inl'f’nnu ■ ■ m 

and hence, by Lemma 2,

S " F{X) ~ 2 ^ f~'j* “ 2 ^ ' (26 > 

A weak form of Theorem 3 is now evident, for if $R(z) = x > 0$ the first
integral in (26) is bounded and the second is $O(e^{-x})$. We obtain a better
approximation to the first integral as follows. From Lemma 2, the
integral in modulus is less than a constant multiple of

J 'jdtHdJ .... (27) 

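and Theorem 3 will follow from

Lemma 3: If $h$ is real and positive, then as $x$ tends to infinity by
positive values

$$\int_1^{\infty} e^{-\mu x - h\lambda}\, d\lambda = O\{(x \log x)^{-h}\} \qquad (28)$$

where

$$\mu e^{-\mu} = \lambda e^{-\lambda} \qquad (0 < \mu < 1 < \lambda) \qquad (29)$$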

To prove the lemma, first observe that since

$$\frac{d\mu}{d\lambda} = -\frac{\mu (\lambda - 1)}{\lambda (1 - \mu)} \qquad (30)$$





it follows that 

It follows from (29) that

$$\mu > \lambda e^{-\lambda}.$$

Writing $u = \lambda + \mu$, $v = \lambda - \mu$, (29) becomes

$$u = \frac{v (e^{v} + 1)}{e^{v} - 1}.$$

Two differentiations of $v(e^{v} + 1) - 2(e^{v} - 1)$ suffice to prove that

$$v (e^{v} + 1) > 2 (e^{v} - 1) \qquad (v > 0)$$


. (31) 


Hence

$$\lambda + \mu > 2 \qquad (33)$$


Using (33), it is an elementary exercise to show that $d\mu/d\lambda$ is an increasing
function of $\lambda$. (One differentiation suffices.) Now, using this fact and
(32) we see that

decreases and therefore 


^ (— /»* — AA) > log * — A (1 ^A<logx) 

and is therefore positive in this range, for large positive x. 

We now split up the range of integration and consider first

$$I_1 = \int_1^{\log x} e^{-\mu x - h\lambda}\, d\lambda.$$

The above analysis shows that the integrand in $I_1 < 1/x^{h+1}$. Hence

$$I_1 < \frac{\log x}{x^{h+1}} = O\{(x \log x)^{-h}\}. \qquad (34)$$

Using (32) again, and making the change of variable
$\lambda = \log x + \log\log x - v$,

$$I_2 = \int_{\log x}^{\log x + \log\log x} e^{-\mu x - h\lambda}\, d\lambda
< \int_{\log x}^{\log x + \log\log x} e^{-x\lambda e^{-\lambda} - h\lambda}\, d\lambda$$

$$< (x \log x)^{-h} \int_0^{\log\log x} \exp (-e^{v} + hv)\, dv
< K (x \log x)^{-h}. \qquad (35)$$

Also

$$I_3 = \int_{\log x + \log\log x}^{\infty} e^{-\mu x - h\lambda}\, d\lambda
< \int_{\log x + \log\log x}^{\infty} e^{-h\lambda}\, d\lambda = h^{-1} (x \log x)^{-h}. \qquad (36)$$

The lemma follows from (34), (35) and (36).

Theorem 3 can be generalized in the same way as Theorem 2 generalizes
Theorem 1. If we take $\tfrac12\pi < \alpha < \tfrac34\pi$, and assume that $F(z)$ is regular
for $|\arg z| < \alpha$ and for $R(z) > -h$ $(h > 0)$ and satisfies

, |<?|<a [ • • (37) 

|.F( f e±*>)| ^Kr-r e - Tootn ) 

then its Laplace transform $f(z)$ is regular for $|\arg z| < \pi$ outside the
region $B(\alpha)$ and by Lemma 1 $F(z)$ has the representation (18). Now in
virtue of the second condition in (37) it follows that $f(z)$ is regular in the
two half planes $R(ze^{\pm i\alpha}) > -\cos\alpha$ and since $B$ is convex these half
planes include the part of the boundary of $B$ omitted from $B(\alpha)$. $f(z)$
is continuous on the boundaries of these two half planes so the
representation (19) with $B$ in place of $K(\alpha)$ is still valid. The rest of the
argument leading to Theorem 3 is still valid and gives

Theorem 4: If $F(z)$ is regular for $|\arg z| < \alpha$, and for $R(z) > -h$
$(h > 0)$ and satisfies (37) where $\tfrac12\pi < \alpha < \tfrac34\pi$ then representing the sum of
its Abel series by $A(z)$,

as $x = R(z)$ tends to infinity by positive values.

It may be of some interest to observe that Halphen (1881) considered 
the Abel Series 

<?(*) = £*(* + »)—. . . (38) 

of the function F(x) = —— (z real and positive) and discussed the rela- 

Z — X 

tionship between F{x) and 0(x) in detail. He proved by elementary 
means that 

/•« l —«*-* 

e^dA = -“£-+<?(*) 

«/ 1 X ~ Z 


(39) 





where $\mu$ is defined by (29). (39) is a special case of (26), with appropriate
changes in sign, for, writing


F(z) = f— (k real and negative) 

tC — z 


00 # 

= - 2 me-** 


Hence (26) becomes


1 f -00 

S~- L —+ 

k-z J _| 


REFERENCES TO LITERATURE

Abel, N. H., 1839. Oeuvres Complètes, Christiania.

Buck, R. C., 1948. "Interpolation Series", Trans. Amer. Math. Soc., lxiv (2),
283-298.

Gelfond, A., 1938. "Interpolation et unicité des fonctions entières", Rec.
Math. (Mat. Sbornik), N.S., xlvi, 115-147.

Gontcharoff, W., 1935. "Sur la convergence de la série d'Abel", Rec. Math.
(Mat. Sbornik), N.S., xlii, 473-483.

Halphen, G. H., 1881. "Sur une série d'Abel", Bull. de la Soc. Math., x,
1881-1882, 67.

Macintyre, A. J., 1939. "Laplace's Transformation and Integral Functions",
Proc. London Math. Soc., (2), xlv, 1-20.

Pflüger, A., 1935-36. "Über eine Interpretation gewisser Konvergenz- und
Fortsetzungseigenschaften Dirichletscher Reihen", Comm. Math. Helv.,
viii (2), 89-129.

Phragmén, E., and Lindelöf, E., 1908. "Sur une extension d'un principe
classique de l'analyse", Acta Math., xxxi, 381-406.

Pólya, G., 1929. "Untersuchungen über Lücken und Singularitäten von
Potenzreihen", Math. Zeits., xxix, 549-640.

Pólya, G., and Szegö, G., 1945 edition. Aufgaben und Lehrsätze, Vol. I, New York.

Rabinovič, Yu. L., 1948. "Inversion Formulae for two kinds of Laplace
Transform", Doklady Akad. Nauk. S.S.S.R. (N.S.), lx, 969-972. (Russian.)

Schmidli, S., 1942. "Über gewisse Interpolationsreihen", Thesis, Zürich.


(Issued separately February 9, 1952)





D. E. Rutherford, Some Continuant 


XVI.—Some Continuant Determinants arising in Physics and
Chemistry—II.* By D. E. Rutherford, D.Sc., Dr.Math.,
United College, University of St. Andrews

(MS. received May 4, 1951. Read June 2, 1951) 

Synopsis 

Some of the formulae obtained in this paper are likely to find application in
problems concerning a rectangular lattice of "atoms", each of which is under the
influence of its near neighbours. Some of the determinants considered apply to
cases in which both the nearest and the next nearest neighbours are operative.
The inverses of certain types of matrices are found, and these may prove to be of
value either in solving systems of linear equations such as arise in relaxation
problems, or in determining the latent roots of matrices which may occur in problems
in applied mathematics.

§ 1. Let $P_n(x)$ denote the square matrix

$$\begin{bmatrix} x & 1 & & & \\ 1 & x & 1 & & \\ & 1 & x & 1 & \\ & & \ddots & \ddots & \ddots \\ & & & 1 & x \end{bmatrix}$$

of order $n$ and let $\phi_n(x)$ denote its determinant. This determinant,
sometimes called Wolstenholme's determinant, and modifications of it
frequently make their appearance in problems in mathematics, physics
and chemistry.

In a previous paper the author (1947) evaluated and factorized many
determinants allied to $\phi_n(x)$, and certain of these have already proved
useful to workers in different branches of science. The present note
contains some further results concerning such determinants which, it is
hoped, will also prove to be of practical application.

It has been decided to omit detailed proofs of many of the formulae 
obtained when these depend merely upon trigonometrical and algebraic 
manipulation; for these formulae, while of interest and use to the 
physicist and applied mathematician, have much less significance in pure 
mathematics. 

* This paper was assisted in publication by a grant from the Carnegie Trust 
for the Universities of Scotland. 






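We begin by gathering together the more important formulae for
$\phi_n(x)$.

$$\phi_n(x) = \prod_{k=1}^{n} \left( x - 2\cos\frac{k\pi}{n+1} \right) \qquad (1.1)$$

$$\phi_n(x) = x^{n} - \binom{n-1}{1} x^{n-2} + \binom{n-2}{2} x^{n-4} - \ldots,$$

the series terminating either with a constant term or a term in $x$.
Writing

$$x = 2\cos\theta = \frac{u + v}{(uv)^{1/2}} \qquad (1.2)$$

we find that if $|x| \ne 2$, then

$$\phi_n(x) = \frac{\sin (n+1)\theta}{\sin\theta} = \frac{1}{(uv)^{n/2}} \cdot \frac{u^{n+1} - v^{n+1}}{u - v}$$

and

$$2\cos n\theta = \frac{u^{n} + v^{n}}{(uv)^{n/2}}. \qquad (1.3)$$

On the other hand,

$$\phi_n(2) = n + 1, \qquad \phi_n(-2) = (-1)^{n} (n + 1). \qquad (1.4)$$

As a curiosity we might mention here that $i^{n} \phi_n(-i)$ is the $(n+1)$th
Fibonacci number, for

$$i \phi_1(-i) = 1, \qquad i^{2} \phi_2(-i) = 2$$

and, by expanding the determinant $\phi_n(-i)$ by the first row we obtain,

$$i^{n} \phi_n(-i) = i^{n-1} \phi_{n-1}(-i) + i^{n-2} \phi_{n-2}(-i).$$

It follows from (1.1) that the $n$th Fibonacci number may be expressed in
the form

$$\prod_{k=1}^{n-1} \left( 1 - 2i \cos\frac{k\pi}{n} \right).$$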
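The Fibonacci remark is easy to check by machine. The sketch below assumes the usual convention $F_1 = F_2 = 1$ and evaluates $\phi_n$ by the three-term recurrence $\phi_k = x\phi_{k-1} - \phi_{k-2}$ obtained on expanding by the first row; the function names are illustrative only.

```python
import math


def phi(n, x):
    """phi_n(x), from phi_k = x*phi_{k-1} - phi_{k-2} with phi_0 = 1, phi_{-1} = 0."""
    previous, current = 0, 1
    for _ in range(n):
        previous, current = current, x * current - previous
    return current


def fibonacci_via_phi(n):
    """n-th Fibonacci number (F_1 = F_2 = 1) as i^(n-1) * phi_{n-1}(-i)."""
    value = (1j) ** (n - 1) * phi(n - 1, -1j)
    return round(value.real)


def fibonacci_via_product(n):
    """n-th Fibonacci number as the product of (1 - 2i cos(k pi / n)), k = 1..n-1."""
    product = 1 + 0j
    for k in range(1, n):
        product *= 1 - 2j * math.cos(k * math.pi / n)
    return round(product.real)
```

Both routes should reproduce 1, 1, 2, 3, 5, 8, ... for $n = 1, 2, 3, \ldots$; the product form is evaluated in floating point and rounded, so it is only reliable for moderate $n$.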


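§ 2. On writing $z = xy + 2$, $a = \tfrac12 (x + y)$, it appears that if we
write

$$T_n(z, a) = \begin{bmatrix} (z-1) & 2a & 1 & & & \\ 2a & z & 2a & 1 & & \\ 1 & 2a & z & 2a & 1 & \\ & 1 & 2a & z & 2a & 1 \\ & & \ddots & \ddots & \ddots & \ddots \\ & & & 1 & 2a & (z-1) \end{bmatrix}$$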




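then $T_n(z, a) = P_n(x) P_n(y)$. Accordingly, taking determinants of both
sides,

$$|T_n(z, a)| = \phi_n(x)\, \phi_n(y)$$

$$= \prod_{k=1}^{n} \left( x - 2\cos\frac{k\pi}{n+1} \right) \prod_{k=1}^{n} \left( y - 2\cos\frac{k\pi}{n+1} \right)$$

$$= \prod_{k=1}^{n} \left( z - 2 - 4a \cos\frac{k\pi}{n+1} + 4\cos^{2}\frac{k\pi}{n+1} \right)$$

$$= \prod_{k=1}^{n} \left\{ z - (2 + a^{2}) + \left( a - 2\cos\frac{k\pi}{n+1} \right)^{2} \right\}. \qquad (2.1)$$

Hence the roots of the equation $|T_n(z, a)| = 0$, regarded as a polynomial
of degree $n$ in $z$ are

$$z_k = 2 + a^{2} - \left( a - 2\cos\frac{k\pi}{n+1} \right)^{2}, \qquad k = 1, \ldots, n.$$

In particular, when $a = -2$, these roots are

$$z_k = 6 - 16 \cos^{4}\frac{k\pi}{2(n+1)}, \qquad k = 1, \ldots, n.$$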
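Formula (2.1) and the special roots for $a = -2$ can be checked numerically. The sketch below builds $T_n(z, a)$ directly from its definition and compares a plain Gaussian-elimination determinant with the product; `det`, `t_matrix` and `product_formula` are names chosen here for illustration.

```python
import math


def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def t_matrix(n, z, a):
    """T_n(z, a): z on the diagonal (z - 1 in both corners), then 2a, then 1."""
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = z
        if i + 1 < n:
            m[i][i + 1] = m[i + 1][i] = 2.0 * a
        if i + 2 < n:
            m[i][i + 2] = m[i + 2][i] = 1.0
    m[0][0] -= 1.0
    m[n - 1][n - 1] -= 1.0
    return m


def product_formula(n, z, a):
    """Right-hand side of (2.1)."""
    result = 1.0
    for k in range(1, n + 1):
        c = 2.0 * math.cos(k * math.pi / (n + 1))
        result *= z - (2.0 + a * a) + (a - c) ** 2
    return result
```

For example, `det(t_matrix(6, 0.7, 1.3))` and `product_formula(6, 0.7, 1.3)` should agree to rounding error, and `det(t_matrix(5, 6 - 16 * math.cos(k * math.pi / 12) ** 4, -2.0))` should be numerically zero for $k = 1, \ldots, 5$.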

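A similar argument applied to the circulant

$$\begin{vmatrix} x & 1 & & & 1 \\ 1 & x & 1 & & \\ & 1 & x & 1 & \\ & & \ddots & \ddots & \ddots \\ 1 & & & 1 & x \end{vmatrix} = \prod_{k=1}^{n} \left( x + 2\cos\frac{2k\pi}{n} \right),$$

shows that

$$\begin{vmatrix} z & 2a & 1 & & & 1 & 2a \\ 2a & z & 2a & 1 & & & 1 \\ 1 & 2a & z & 2a & 1 & & \\ & 1 & 2a & z & 2a & 1 & \\ & & \ddots & \ddots & \ddots & \ddots & \ddots \\ 2a & 1 & & & 1 & 2a & z \end{vmatrix} = \prod_{k=1}^{n} \left\{ z - (2 + a^{2}) + \left( a + 2\cos\frac{2k\pi}{n} \right)^{2} \right\}. \qquad (2.2)$$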


Indeed, any circulant can be factorized as follows. The latent
roots of

$$\begin{bmatrix} a_0 & a_1 & a_2 & \cdots & a_{n-1} \\ a_{n-1} & a_0 & a_1 & \cdots & a_{n-2} \\ \vdots & & & & \vdots \\ a_1 & a_2 & a_3 & \cdots & a_0 \end{bmatrix} = \sum_{r=0}^{n-1} a_r K_n^{r}$$

are $a_0 + a_1 e^{2\pi ki/n} + \ldots + a_{n-1} e^{2(n-1)\pi ki/n}$, $k = 1, \ldots, n$, since, by a
theorem of Frobenius (MacDuffee, 1933, p. 23), those of

$$K_n = \begin{bmatrix} 0 & 1 & 0 & 0 & \cdots & 0 \\ 0 & 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & 0 & \cdots & 0 \end{bmatrix} \qquad (2.3)$$

are $e^{2\pi ki/n}$, $k = 1, \ldots, n$. If the circulant is symmetric, that is, if
$a_r = a_{n-r}$, these latent roots are

$$a_0 + a_1 (e^{2\pi ki/n} + e^{-2\pi ki/n}) + a_2 (e^{4\pi ki/n} + e^{-4\pi ki/n}) + \ldots$$

or

$$a_0 + 2a_1 \cos\frac{2k\pi}{n} + 2a_2 \cos\frac{4k\pi}{n} + \ldots,$$

the last term being $2a_s \cos\frac{2sk\pi}{n}$ if $n = 2s + 1$, or $(-1)^{k} a_s$ if $n = 2s$. The
latent roots are of course factors of the determinant.
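The statement about latent roots of a circulant can be confirmed by direct substitution. The sketch below assumes the circulant whose first row is $a_0, a_1, \ldots, a_{n-1}$ with each later row shifted one place to the right, and tests the vector with components $e^{2\pi jki/n}$; the helper names are illustrative.

```python
import cmath


def circulant(coeffs):
    """Circulant whose (i, j) element is a_{(j - i) mod n}."""
    n = len(coeffs)
    return [[coeffs[(j - i) % n] for j in range(n)] for i in range(n)]


def latent_root(coeffs, k):
    """a_0 + a_1 w^k + ... + a_{n-1} w^{(n-1)k} with w = exp(2 pi i / n)."""
    n = len(coeffs)
    return sum(c * cmath.exp(2j * cmath.pi * r * k / n) for r, c in enumerate(coeffs))


def eigen_residual(coeffs, k):
    """Largest component of C v - lambda v for v_j = w^{jk}; zero up to rounding."""
    n = len(coeffs)
    matrix = circulant(coeffs)
    vector = [cmath.exp(2j * cmath.pi * j * k / n) for j in range(n)]
    root = latent_root(coeffs, k)
    return max(
        abs(sum(matrix[i][j] * vector[j] for j in range(n)) - root * vector[i])
        for i in range(n)
    )
```

With the symmetric choice `[3.0, 1.0, 0.0, 0.0, 1.0]` the roots reduce to the real values $3 + 2\cos(2k\pi/5)$, in line with the cosine form given above.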

The evaluation of the determinant of the matrix

$$Q_n(z, a) = \begin{bmatrix} z & 2a & 1 & & & \\ 2a & z & 2a & 1 & & \\ 1 & 2a & z & 2a & 1 & \\ & 1 & 2a & z & 2a & 1 \\ & & \ddots & \ddots & \ddots & \ddots \\ & & & 1 & 2a & z \end{bmatrix}$$

presents greater difficulties, and so far it has not been found possible to
express the determinant as a product of factors linear in $z$.

Using the addition theorem for determinants, it is easily shown
that

$$|T_n| = |Q_n| - 2|Q_{n-1}| + |Q_{n-2}|.$$

Defining $|T_0| = |Q_0| = 1$, $T_1(z, a) = z - 2$, the reciprocal relation is

$$|Q_n| = |T_n| + 2|T_{n-1}| + 3|T_{n-2}| + \ldots + (n+1)|T_0|.$$
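Both relations between $|T_n|$ and $|Q_n|$ can be tested exactly with rational arithmetic. This is a sketch only; `band_matrix`, `det_t` and `det_q` are illustrative names, and the determinant of the empty matrix is taken as 1 to match $|T_0| = |Q_0| = 1$.

```python
from fractions import Fraction


def det_exact(matrix):
    """Exact determinant by fraction-based Gaussian elimination."""
    a = [[Fraction(v) for v in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def band_matrix(n, z, a, reduce_corners):
    """Q_n(z, a), or T_n(z, a) when the two corner elements are reduced by 1."""
    m = [[0] * n for _ in range(n)]
    for i in range(n):
        m[i][i] = z
        if i + 1 < n:
            m[i][i + 1] = m[i + 1][i] = 2 * a
        if i + 2 < n:
            m[i][i + 2] = m[i + 2][i] = 1
    if reduce_corners and n > 0:
        m[0][0] -= 1
        m[n - 1][n - 1] -= 1
    return m


def det_t(n, z, a):
    return det_exact(band_matrix(n, z, a, True))


def det_q(n, z, a):
    return det_exact(band_matrix(n, z, a, False))
```

For $n = 1$ the corner reduction is applied twice to the single element, which gives $z - 2$ as in the convention stated above.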









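$$|T_n(z, a)| = \phi_n(x)\, \phi_n(y) = \frac{\sin (n+1)\theta \, \sin (n+1)\psi}{\sin\theta \, \sin\psi},$$

where $x = 2\cos\theta$ and $y = 2\cos\psi$ are the roots of the quadratic equation

$$z = 2 + a^{2} - (a - \xi)^{2}$$

in $\xi$. We may therefore write, putting $2\alpha = \theta - \psi$, $2\beta = \theta + \psi$,

$$|Q_n| = \sum_{s=1}^{n+1} \frac{(n + 2 - s) \sin s\theta \, \sin s\psi}{\sin\theta \, \sin\psi} = \sum_{s=1}^{n+1} \frac{(n + 2 - s)(\cos 2s\alpha - \cos 2s\beta)}{\cos 2\alpha - \cos 2\beta}.$$

Some trigonometrical manipulation, which need not be reproduced here,
then shows that

$$|Q_n| = \frac{[1 - \cos (n+2)\theta \cos (n+2)\psi] \sin\theta \sin\psi - [1 - \cos\theta \cos\psi] \sin (n+2)\theta \sin (n+2)\psi}{2 \sin\theta \sin\psi \, (\cos\theta - \cos\psi)^{2}}$$

$$= \frac{1}{2(\cos 2\alpha - \cos 2\beta)} \left[ \frac{\sin^{2} (n+2)\alpha}{\sin^{2}\alpha} - \frac{\sin^{2} (n+2)\beta}{\sin^{2}\beta} \right]. \qquad (2.4)$$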


The formulae (2.1), (2.2) and (2.4) may be expected to find application
in one-dimensional problems in which there are a large number of
uniformly spaced "atoms" or nodes, each node being influenced by its
two nearest neighbours and also by its two next nearest neighbours.
The slight differences between the three determinants considered arise
from different end conditions.


§ 3. It is interesting to observe how these determinants may arise in
relaxation problems. By Taylor's theorem we find that

$$f(-2a) - 4f(-a) + 6f(0) - 4f(a) + f(2a) = a^{4} f^{\mathrm{iv}}(0) + \tfrac16 a^{6} f^{\mathrm{vi}}(0) + \ldots$$

Hence if the nodes, at distance $a$ apart, are chosen close enough for $a^{6}$
to be neglected, then the relation

$$f(-2a) - 4f(-a) + 6f(0) - 4f(a) + f(2a) = 0$$

is a finite difference approximation at $x = 0$ to the differential equation
$f^{\mathrm{iv}}(x) = 0$. In the same way, we can show that another approximation
to the same equation is

$$a^{2} f''(-a) - 2f(-a) + 5f(0) - 4f(a) + f(2a) = 0.$$
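The two difference formulae can be checked on polynomials, for which the Taylor series terminates. The sketch below uses exact rational arithmetic; `fourth_difference` and `one_sided_form` are illustrative names, and the second derivative is supplied by the caller.

```python
from fractions import Fraction


def fourth_difference(f, a):
    """f(-2a) - 4 f(-a) + 6 f(0) - 4 f(a) + f(2a)."""
    return f(-2 * a) - 4 * f(-a) + 6 * f(0) - 4 * f(a) + f(2 * a)


def one_sided_form(f, d2f, a):
    """a^2 f''(-a) - 2 f(-a) + 5 f(0) - 4 f(a) + f(2a)."""
    return a * a * d2f(-a) - 2 * f(-a) + 5 * f(0) - 4 * f(a) + f(2 * a)
```

With $f(x) = x^{4}$ the first expression equals $24a^{4} = a^{4} f^{\mathrm{iv}}(0)$, with $f(x) = x^{6}$ it equals $120a^{6} = \tfrac16 a^{6} f^{\mathrm{vi}}(0)$, and both expressions vanish identically for any cubic.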

It follows that if $u = f(x)$ and if $u$ and $u''$ are known on the boundaries
$x = 0$ and $x = (n + 1)a$, then we have the following equations for
$u_1 = f(a), \ldots, u_n = f(na)$,

$$5u_1 - 4u_2 + u_3 = 2u_0 - a^{2} u_0''$$

$$-4u_1 + 6u_2 - 4u_3 + u_4 = -u_0$$

$$u_1 - 4u_2 + 6u_3 - 4u_4 + u_5 = 0$$







In the matrix form these equations become

$$T_n(6, -2)\, u = v, \qquad (3.1)$$

where $u$ is the column vector of the unknown elements $u_1, \ldots, u_n$ and
$v$ is the column vector with elements

$$2u_0 - a^{2} u_0'', \quad -u_0, \quad 0, \ldots, 0, \quad -u_{n+1}, \quad 2u_{n+1} - a^{2} u_{n+1}''$$

which are determined by the boundary conditions.

Once we have found the inverse (§ 5) of the matrix $T_n(z, a)$, we can
write down the solution of the relaxation equations in the form

$$u = [T_n(6, -2)]^{-1} v. \qquad (3.2)$$

These results concerning relaxation equations can be extended, but we
shall not dwell on them here.


§ 4. The matrix $T_n(z, a)$ refers to a one-dimensional lattice of "atoms"
or nodes in which each node is affected by its nearest and its next nearest
nodes. We shall now consider the corresponding matrix for a
two-dimensional square lattice. It is a matrix of order $mn$ and is of the
form

$$T_m(Z_n, A_n) = P_m[P_n(x)]\, P_m[P_n(y)]$$

where

$$Z_n = P_n(x) P_n(y) + 2I_n = T_n(z, a) + 2I_n = T_n(z + 2, a),$$

$$A_n = \tfrac12 [P_n(x) + P_n(y)] = P_n\{\tfrac12 (x + y)\} = P_n(a)$$

and $I_n$ is the unit matrix of order $n$.

In fact by equation (4.1) of the author's previous paper,

$$|T_m(Z_n, A_n)| = |P_m(P_n(x))| \, |P_m(P_n(y))|$$

$$= \prod_{k=1}^{m} \prod_{l=1}^{n} \left( x - 2\cos\frac{k\pi}{m+1} - 2\cos\frac{l\pi}{n+1} \right) \left( y - 2\cos\frac{k\pi}{m+1} - 2\cos\frac{l\pi}{n+1} \right)$$

$$= \prod_{k=1}^{m} \prod_{l=1}^{n} \left[ z - (2 + a^{2}) + \left( a - 2\cos\frac{k\pi}{m+1} - 2\cos\frac{l\pi}{n+1} \right)^{2} \right].$$

It follows that the roots of the equation $|T_m(Z_n, A_n)| = 0$ are

$$2 + a^{2} - \left( a - 2\cos\frac{k\pi}{m+1} - 2\cos\frac{l\pi}{n+1} \right)^{2}, \qquad k = 1, \ldots, m; \; l = 1, \ldots, n.$$










In particular, if $a = -4$, the roots are

$$18 - 16 \left( \cos^{2}\frac{k\pi}{2(m+1)} + \cos^{2}\frac{l\pi}{2(n+1)} \right)^{2}, \qquad k = 1, \ldots, m; \; l = 1, \ldots, n.$$


A matrix which might be termed a two-dimensional circulant is one
of the following type:

$$S = \begin{bmatrix} B_0 & B_1 & B_2 & \cdots & B_{m-1} \\ B_{m-1} & B_0 & B_1 & \cdots & B_{m-2} \\ \vdots & & & & \vdots \\ B_1 & B_2 & B_3 & \cdots & B_0 \end{bmatrix} = \sum_{r=0}^{m-1} K_m^{r} \langle B_r \rangle,$$

where $\langle\ \rangle$ denotes a direct product, $K_m$ of order $m$ has the same
significance as in (2.3), and where

$$B_r = a_{r0} I_n + a_{r1} K_n + \ldots + a_{r,n-1} K_n^{n-1}.$$

Thus,

$$S = \sum_{r=0}^{m-1} \sum_{s=0}^{n-1} a_{rs} K_m^{r} \langle K_n^{s} \rangle.$$

By a theorem of Stephanos (1900) the latent roots of $S$ are

$$\sum_{r=0}^{m-1} \sum_{s=0}^{n-1} a_{rs} \lambda_h^{r} \mu_k^{s}, \qquad h = 1, \ldots, m; \; k = 1, \ldots, n,$$

where $\lambda_1, \ldots, \lambda_m$ are the latent roots of $K_m$ and $\mu_1, \ldots, \mu_n$ are the
latent roots of $K_n$. Since the latent roots of $K_n$ are $e^{2\pi ki/n}$, $k = 1, \ldots, n$,
those of $S$ are

$$\sum_{r=0}^{m-1} \sum_{s=0}^{n-1} a_{rs} e^{2\pi i (rh/m + sk/n)}, \qquad h = 1, \ldots, m; \; k = 1, \ldots, n.$$

In particular, if $B_0 = Z_n$, $B_1 = B_{m-1} = 2A_n$, $B_2 = B_{m-2} = I_n$ and all
other $B_r$ are zero, the roots of the equation $|S| = 0$ are

$$z_{hk} = 2 + a^{2} - \left( a + 2\cos\frac{2h\pi}{m} + 2\cos\frac{2k\pi}{n} \right)^{2}, \qquad h = 1, \ldots, m; \; k = 1, \ldots, n.$$

This $S$ is the appropriate matrix for a closed "square" lattice covering
a torus, nearest and next nearest neighbours being considered.

The above formulae can easily be extended to cover three-dimensional
types of determinant related to a cubic lattice.


§ 5. In this section we shall consider the inverses of certain matrices.
A knowledge of these may be valuable for two reasons. First, they may
be used to solve a system of relaxation equations such as (3.1). Secondly,
if one of these inverse matrices arises in some physical problem we can
at once obtain its latent roots if those of the original matrix are known.
Let $R_n(x, a, b)$ be the matrix

$$\begin{bmatrix} x + b & 1 & & & \\ 1 & x & 1 & & \\ & 1 & x & 1 & \\ & & \ddots & \ddots & \ddots \\ & & 1 & x & 1 \\ & & & 1 & x + a \end{bmatrix}$$

of order $n$, and let us extend our definition of $\phi_n(x)$ so that, even when $n$
is not integer,

$$\phi_n(2\cos\theta) = \frac{\sin (n+1)\theta}{\sin\theta}, \qquad \theta \ne 0, \pi,$$

$$\phi_n(2) = n + 1, \qquad \phi_n(-2) = (-1)^{n} (n + 1).$$

It can then be shown that the $r,s$th element $\eta_{rs}$ of the inverse of
$R_n(x, -\phi_{\beta-1}(x)/\phi_{\beta}(x), -\phi_{\alpha-1}(x)/\phi_{\alpha}(x))$ is given by

$$\eta_{rs} = \eta_{sr} = \frac{(-1)^{r+s} \phi_{\alpha+r-1}(x)\, \phi_{\beta+n-s}(x)}{\phi_{\alpha+\beta+n}(x)}, \qquad r \le s. \qquad (5.1)$$

In illustration, we observe that if $x = -2$, $\alpha = -\tfrac12$, $\beta \to \infty$, then

$$-2[R_n(-2, +1, -1)]^{-1} = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & 3 & 3 & \cdots & 3 \\ 1 & 3 & 5 & \cdots & 5 \\ \vdots & & & & \vdots \\ 1 & 3 & 5 & \cdots & 2n - 1 \end{bmatrix}.$$

Since the latent roots of $R_n(-2, +1, -1)$ are $-4\cos^{2}[(2r - 1)\pi/4n]$,
those of the matrix on the right-hand side of the last equation are
$\tfrac12 \sec^{2}[(2r - 1)\pi/4n]$, $r = 1, \ldots, n$. This matrix and its latent roots
arise in certain calculations by Duncan (1950).

Again, putting $\alpha = \beta = 0$, we obtain the formula

$$\eta_{rs} = \eta_{sr} = \frac{(-1)^{r+s} \phi_{r-1}(x)\, \phi_{n-s}(x)}{\phi_n(x)}, \qquad r \le s,$$

for the inverse of $P_n(x)$, or $R_n(x, 0, 0)$. In particular, the inverse matrix
of $P_n(-2)$ is given by

$$\eta_{rs} = \frac{-r(n + 1 - s)}{n + 1}, \qquad r \le s.$$




This last formula was given, apart from a misprinted sign, by Todd
(1950).
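The formula for the inverse of $P_n(x)$ can be verified exactly for any rational $x$ with $\phi_n(x) \ne 0$. The sketch below treats only the case $\alpha = \beta = 0$; the function names are illustrative, and $\phi_k$ is again computed from the recurrence $\phi_k = x\phi_{k-1} - \phi_{k-2}$.

```python
from fractions import Fraction


def phi(n, x):
    """phi_n(x) from phi_k = x*phi_{k-1} - phi_{k-2}, phi_0 = 1, phi_{-1} = 0."""
    previous, current = 0, 1
    for _ in range(n):
        previous, current = current, x * current - previous
    return current


def p_matrix(n, x):
    """P_n(x): x on the diagonal, 1 on the two adjacent diagonals."""
    return [[x if i == j else (1 if abs(i - j) == 1 else 0) for j in range(n)]
            for i in range(n)]


def inverse_from_formula(n, x):
    """eta_rs = (-1)^(r+s) phi_{r-1}(x) phi_{n-s}(x) / phi_n(x) for r <= s, symmetric."""
    eta = [[Fraction(0)] * n for _ in range(n)]
    for r in range(1, n + 1):
        for s in range(r, n + 1):
            value = Fraction((-1) ** (r + s) * phi(r - 1, x) * phi(n - s, x)) / phi(n, x)
            eta[r - 1][s - 1] = eta[s - 1][r - 1] = value
    return eta


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]
```

Multiplying `p_matrix(n, x)` by `inverse_from_formula(n, x)` should give the unit matrix exactly, and for $x = -2$ the elements reduce to $-r(n + 1 - s)/(n + 1)$.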

Since $P_n(x)$ and $P_n(y)$ are commuting matrices, and since $T_n(z, a) =
P_n(x) P_n(y)$, where $x$ and $y$ are the roots of the quadratic equation

$$(\xi - a)^{2} + z - 2 - a^{2} = 0$$

in $\xi$, we have at once,

$$[T_n(z, a)]^{-1} = [P_n(x)]^{-1} [P_n(y)]^{-1}.$$

For many purposes it is convenient to leave $T_n^{-1}$ in this form. Thus
(3.2) may be written

$$u = [P_n(-2)]^{-1} [P_n(-2)]^{-1} v,$$

from which the components of $u$ are easily obtained, by premultiplying
the known vector $v$ by $[P_n(-2)]^{-1}$ twice in succession.
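The double premultiplication can be tried on a case with a known answer. The sketch below assumes the relaxation system $T_n(6, -2)u = v$ of § 3, uses $-r(n + 1 - s)/(n + 1)$ $(r \le s)$ for the elements of $[P_n(-2)]^{-1}$, and takes the cubic $f(x) = x^{3}$ on $0 \le x \le 1$ as test data, for which the difference equations hold exactly; the function and argument names are illustrative.

```python
from fractions import Fraction


def apply_inverse_p(v):
    """Premultiply v by [P_n(-2)]^{-1}, element (r, s) = -r(n+1-s)/(n+1) for r <= s."""
    n = len(v)
    out = []
    for r in range(1, n + 1):
        total = Fraction(0)
        for s in range(1, n + 1):
            low, high = min(r, s), max(r, s)
            total += Fraction(-low * (n + 1 - high), n + 1) * v[s - 1]
        out.append(total)
    return out


def solve_relaxation(u0, d2u0, u_end, d2u_end, n, a):
    """Solve T_n(6, -2) u = v for n >= 2 interior nodes at spacing a."""
    v = [Fraction(0)] * n
    v[0] += 2 * u0 - a * a * d2u0          # first equation
    v[1] += -u0                            # second equation
    v[n - 2] += -u_end                     # last equation but one
    v[n - 1] += 2 * u_end - a * a * d2u_end  # last equation
    return apply_inverse_p(apply_inverse_p(v))
```

With four interior nodes and $a = \tfrac15$, the boundary data of $x^{3}$ are $u_0 = 0$, $u_0'' = 0$, $u_5 = 1$, $u_5'' = 6$, and the computed $u_j$ should equal $(j/5)^{3}$.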

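It is difficult to find the inverse matrix of $P_m[P_n(x)]$ in a form suitable
for calculation when $m$ and $n$ are large, but the following method can be
adopted when $m$ and $n$ are small.

Let $Q_m(x)$ be the adjugate of $P_m(x)$. Then

$$P_m(x)\, Q_m(x) = \phi_m(x)\, I_m.$$

Consequently

$$P_m(P_n(x))\, Q_m(P_n(x)) = I_m \langle \phi_m(P_n(x)) \rangle = I_m \left\langle \prod_{k=1}^{m} \left\{ P_n(x) - 2\cos\frac{k\pi}{m+1}\, I_n \right\} \right\rangle = I_m \left\langle \prod_{k=1}^{m} P_n\!\left( x - 2\cos\frac{k\pi}{m+1} \right) \right\rangle.$$

It follows that

$$[P_m(P_n(x))]^{-1} = Q_m(P_n(x)) \; I_m \left\langle \prod_{k=1}^{m} \frac{Q_n\!\left( x - 2\cos\frac{k\pi}{m+1} \right)}{\phi_n\!\left( x - 2\cos\frac{k\pi}{m+1} \right)} \right\rangle;$$

the matrix $Q_n(x)$ is of course obtainable from the formula (5.1).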


§ 6. In conclusion, it is interesting to observe that we can write

$$P_n(x) = x I_n + 2\cos\Theta_n,$$

where

$$\Theta_n = A_n \Psi_n A_n,$$

in which

$$\Psi_n = \frac{\pi}{n+1} \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 2 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & n \end{bmatrix}$$

and $A_n$ $(= A_n^{-1})$ is the matrix whose $r,s$th element is
$\sqrt{\dfrac{2}{n+1}} \sin\dfrac{rs\pi}{n+1}$.
To show this it is only necessary to verify that

$$A_n P_n(0) A_n = 2\cos\Psi_n$$

and that $A_n^{2} = I_n$.

It also follows easily from these formulae that the latent vectors of
$P_n(x)$ are the columns of $A_n$.

It can be shown that the elements of $\Theta_n$ are given by

$$\theta_{rs} = 0, \text{ if } r + s \text{ is even and } r \ne s, \qquad \theta_{rr} = \frac{\pi}{2},$$

$$\theta_{rs} = \frac{\pi}{2(n+1)^{2}} \left( \cot^{2}\frac{(r+s)\pi}{2(n+1)} - \cot^{2}\frac{(r-s)\pi}{2(n+1)} \right) \text{ if } r + s \text{ is odd.}$$
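These closing statements lend themselves to a numerical check. The sketch below forms $A_n$ and $\Psi_n$ in floating point, computes $\Theta_n = A_n \Psi_n A_n$ by direct multiplication, and compares it with the element formula; the function names are illustrative and the comparison holds only up to rounding error.

```python
import math


def a_matrix(n):
    """A_n with (r, s) element sqrt(2/(n+1)) * sin(r s pi / (n+1))."""
    c = math.sqrt(2.0 / (n + 1))
    return [[c * math.sin(r * s * math.pi / (n + 1)) for s in range(1, n + 1)]
            for r in range(1, n + 1)]


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def theta_by_product(n):
    """Theta_n = A_n Psi_n A_n with Psi_n = pi/(n+1) * diag(1, ..., n)."""
    a = a_matrix(n)
    psi = [[(i + 1) * math.pi / (n + 1) if i == j else 0.0 for j in range(n)]
           for i in range(n)]
    return matmul(matmul(a, psi), a)


def theta_by_formula(n):
    """Elements of Theta_n from the closed expressions for theta_rs."""
    h = math.pi / (2 * (n + 1))
    cot2 = lambda t: (math.cos(t) / math.sin(t)) ** 2
    theta = [[0.0] * n for _ in range(n)]
    for r in range(1, n + 1):
        for s in range(1, n + 1):
            if r == s:
                theta[r - 1][s - 1] = math.pi / 2
            elif (r + s) % 2 == 1:
                theta[r - 1][s - 1] = (math.pi / (2 * (n + 1) ** 2)) * (
                    cot2((r + s) * h) - cot2((r - s) * h))
    return theta
```

The same helpers can be used to confirm that $A_n^{2}$ is the unit matrix and that $A_n P_n(0) A_n$ is diagonal with elements $2\cos\{k\pi/(n+1)\}$.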


REFERENCES TO LITERATURE

Duncan, W. J., 1950. "Dependence of Errors in the Natural Frequencies of the
Numbers of Elements in a Segmental Representation of a Uniform Shaft",
College of Aeronautics, Cranfield, Report.

MacDuffee, C. C., 1933. The Theory of Matrices, Berlin.

Rutherford, D. E., 1947. "Some Continuant Determinants Arising in Physics
and Chemistry", Proc. Roy. Soc. Edin., A, lxii, 220-230.

Stephanos, 1900. "Sur une extension du calcul des substitutions linéaires",
Journ. de Mathématiques, V, vi, 73-128.

Todd, John, 1950. "The Condition of a Certain Matrix", Proc. Camb. Phil.
Soc., xlvi, 116-118.


(Issued separately February 9, 1952)




N. Feather, The Sargent Diagram


XVII.—The Sargent Diagram for the Electron-capture Process,
and the Disintegration Energies of Heavy β-emitters.*
By N. Feather, Department of Natural Philosophy, University
of Edinburgh


(MS. received May 23, 1951. Read November 5, 1951)

Synopsis 

A Sargent diagram is presented containing 12 plotted points relative to
capture-active species in the range of atomic number (Z) from 89 to 98 inclusive.
Arguments are adduced to show that the "allowed" line of the diagram is located as
theory predicts, and the capture transformations of other heavy capture-active
species are discussed with the aid of the diagram. In particular, values are
deduced for the energies of capture transformation of 17 species for which
79 ≤ Z ≤ 85, and, taking count of these values, the energies of β-disintegration of
150 species having 70 < Z < 98 are assumed known, and are suitably plotted
against neutron number N. Discontinuities are found, for certain values of
isotopic number, in the region of N = 126 (and Z = 82). Values of α-disintegration
energy are also deduced for certain isotopes of bismuth and lead.

The Sargent Diagram

The original Sargent diagram (Sargent, 1933) provided the first successful
demonstration of the existence of a significant relation between
disintegration energy and disintegration constant in β-emission. At a
time when there was no satisfactory theory of the disintegration process,
when the distinction between simple and complex β-spectra was at the
best very imperfectly appreciated, and when the only β-active bodies
known were those of classical radioactivity, this essentially successful
essay in empiricism was based upon 12 plotted points, of which one (for
AcB) appeared obviously out of place. With the other 11 points it
exhibited the distinction between "allowed" and "forbidden" transitions,
which has since been amply vindicated, and it spoke in favour of
the diagram that further investigation at an early stage (Sargent, 1939)
showed that the apparently anomalous point was incorrectly placed as
a result of faulty interpretation of experimental data. From the
vantage-point of to-day we know that the surprising success of the diagram was
to some extent fortuitous; it derived from the fact that in several cases
the β-active bodies concerned transform predominantly according to a
single mode, and from the now theoretically justified result that in
β-disintegration (as opposed to α-disintegration) transition probabilities

* This paper was assisted in publication by a grant from the Carnegie Trust for
the Universities of Scotland.





are not very strongly dependent on nuclear charge. Only in this way
could significant regularities have emerged in a diagram covering the
range in atomic number, Z, 80 < Z < 92. From the vantage-point of
to-day, also, we know that the multiplicity of transition types is greater
than the two of the original diagram (Feather and Richardson, 1948),
but its essential features are still recognizable in its modern counterpart,
and its influence on the development of theory cannot be overestimated.

In relation to the electron-capture process the empiricist is still to-day
very much in the position in which Sargent found himself in respect of
β-disintegration in 1933. Because of the nature of the case
capture-disintegration energies cannot be determined directly, but have to be
deduced from indirect evidence, and, again, that evidence is at present
available in any quantity only for the heavy radioelements. To
complicate the position there is the prediction of theory (Konopinski, 1943)
that the Z-dependence of disintegration probability is more marked for
capture-transformation than for β-disintegration. In spite of all this,
the attempt to construct a Sargent diagram for the capture process would
appear to be worth while. Such an attempt is made here, and one of
the uses of the diagram so obtained is illustrated in the second part of
the paper.

To the writer's knowledge, the only previous attempt in the direction
indicated is that of Thompson. In a short summary of his work
Thompson (1949) gives a diagram containing 15 plotted points relative
to heavy capture-active species (which unfortunately are not identified
in the published account), and suggests that 11 of these points might
define a single Sargent curve. The other 4 points appear to represent
transitions which, in comparison with the former, are more forbidden.
Transformation energies are calculated by the method of "closed cycles"
using known (or inferred) values of α-disintegration energy, and in each
case the point plotted represents the total energy available for K-capture
transformation from nuclear ground state to ground state. The author
recognizes the possibility that in many cases the actual transformation
energy may be much less than the assumed energy, but in the absence of
information concerning associated γ-ray emission he takes the bold
step of disregarding this possibility. The slope of his main Sargent line
corresponds to a fourth-power law relating disintegration constant with
disintegration energy.

Now it is a dofinite prediction of theory that flic transition probability 
for an allowed capture transformation is directly proportional to the 
square of the energy of the emitted neutrino; thus if Thompson’s line 
is significant, either theory is incorrect on this point or no allowed transi¬ 
tions are represented on liis diagram. It would appear more likely than 
either alternative that the line on his diagram is not a significant Sargent, 
line, and that the unjustified assumption regarding disintegration energy 



244 N. Feather, The Sargent Diagram 

above referred to has resulted in the serious misplacing of several points
on the diagram. For that reason the information available concerning
the heavy capture-active species has been re-surveyed quite independently
of Thompson's earlier work, and to the conclusions reached notes
are appended concerning each case treated, so that the reader may
judge for himself how much is still conjecture and how much established
fact. During the last two years much has been discovered concerning
the γ-radiation emitted by capture-active species, but conjecture has
still to play a part in any survey, as will be evident from what follows.
Even admitting some conjecture, however, only 12 points appear on the
Sargent diagram here proposed.

As in Thompson’s survey, capture-disintegration energies are again
deduced by the standard method of the closed decay cycle. In this
method, for each series of capture-active species of constant isotopic
number, one energy of β-disintegration and all relevant energies of
α-disintegration must be known. These conditions are satisfied for
some 26 capture-active species for which 89 ≤ Z ≤ 98, and, with confidence,
for none having Z < 87. The present diagram then refers only
to 89 ≤ Z ≤ 98. Closing of a decay cycle effectively gives the total
energy E available for capture of a “valency” electron by the nucleus.
For K-electron capture, if $W_K$ is the K-ionization energy of the daughter
atom, then $E - W_K$ is the energy carried away by the neutrino in a
ground-to-ground state transition, and $E - W_K - E'_r$ is the energy so carried
away when the daughter nucleus is left excited in a state of excitation
energy $E'_r$. Table I shows the values of E, $W_K$ and $E'_r$ (in keV) used
in plotting the points on the diagram. As previously mentioned, only


Table I

Species   E (keV)        W_K (keV)   E'_r (keV)   Reference, fig. 1
U         520 ± 20       113         0            1
Np        430 ± 15       116         0            2
Am        1,060          122         0            3
Am        1,060          122         277          4
Ac        1,350          104         0            5
Pa        510            110         0            6
U         540 ± 20       113         0            7
Pu        180 ± 100      119         0            8
Pu        170 ± 100      119         0            9
Pa        1,420 ± 200    110         940          10
Np        2,240 ± 300    116         1,600        11
Cf        300 ± 200      133         0            12



for the Electron-capture Process


245 


12 points are plotted, these having reference to the assumed chief modes
of capture transformation of 11 of the 26 species with 89 ≤ Z ≤ 98 for
which E has been deduced. For the remaining 15 species in this group
information regarding disintegration modes was judged too indefinite to
enable a plot to be made.
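
The abscissa of each plotted point follows from Table I by simple subtraction. The sketch below is illustrative only: it takes the central values of E, W_K and E'_r as read from Table I (quoted uncertainties are ignored, and the function and variable names are assumptions of this sketch, not the author's) and evaluates the neutrino energy E − W_K − E'_r for each of the 12 points.

```python
# Central values in keV as read from Table I: (E, W_K, E'_r).
# Keys are the reference numbers of the points on fig. 1.
# Quoted uncertainties are deliberately ignored in this sketch.
TABLE_I = {
    1: (520, 113, 0),
    2: (430, 116, 0),
    3: (1060, 122, 0),
    4: (1060, 122, 277),
    5: (1350, 104, 0),
    6: (510, 110, 0),
    7: (540, 113, 0),
    8: (180, 119, 0),
    9: (170, 119, 0),
    10: (1420, 110, 940),
    11: (2240, 116, 1600),
    12: (300, 133, 0),
}


def neutrino_energy_kev(e_total, w_k, e_excitation):
    """Energy carried away by the neutrino in K-capture: E - W_K - E'_r."""
    return e_total - w_k - e_excitation


neutrino_energies = {
    ref: neutrino_energy_kev(*row) for ref, row in TABLE_I.items()
}

for ref, e_nu in sorted(neutrino_energies.items()):
    print(f"point {ref:2d}: neutrino energy = {e_nu} keV")
```

Points 3 and 4 share the same E and W_K and differ only in E'_r, which is why a single species contributes two points to the diagram.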

In the table itself horizontal lines divide the data into three groups
according to estimated reliability, the estimate taking count not only
of the reliability of energy determinations, but also of the degree of
confidence in the assignment of partial disintegration constants as
discussed in the notes below. In the Sargent diagram the four “more
reliable” points are represented by full circles, the five “reliable” points
by full triangles and the three “less reliable” points by open circles.
Notes on individual cases are as follows, τ being the half-value period
directly determined for each species, and the figure in brackets referring
to the representative point on fig. 1.

U. τ = 4.2 d. (1) α/capture branching ratio ~ 5 × 10⁻⁶. Capture assumed to occur in a single mode, in spite of knowledge of a 35 keV state in Pa from the disintegration UY → Pa (Knight and Macklin, 1949). “No γ-rays” reported by Crane, Ghiorso and Perlman (1949).

Np. τ = 420 ± 20 d. (2) α/capture branching ratio ~ 5 × 10⁻⁵. Capture assumed to occur in a single mode. Energy almost certainly insufficient to excite the 410 keV state of U (Sullivan, Kohman and Swartout, 1945) by K-capture. Excitation of the postulated 50 keV state (Albouy and Teillac, 1951) is not, however, ruled out.

Am. τ = 12 h. (3 and 4) α/capture branching ratio ≤ 10⁻⁴. Seaborg, James and Morgan (1948) report a 285 keV γ-ray in about 60 per cent. of disintegrations. This is tentatively identified with the 277 keV γ-ray emitted in the disintegration Np → Pu (Fulbright, 1947), and capture therefore assumed to be 60 per cent. to the 277 keV state and 40 per cent. to the ground state of Pu. This assumption is roughly consistent with knowledge of the relative intensities of the 400 keV and 680 keV components of the β-spectrum of Np.

Ac. τ = 2.9 h. (5) α/capture branching ratio ~ 1/9. Capture assumed in roughly equal proportions to the ground state and to the state (or states) at ~ 85 keV; compare the α-disintegration RdTh → ThX. Only one point (for the ground-to-ground state transition) is plotted on the diagram.

Pa. τ = 1.5 d. (6) α/capture branching ratio ~ 1/99. Capture is assumed in 60/40 ratio between transitions to the ground state and to the 80 keV state (Studier, 1947) of Th. Point for former transition alone is plotted. Possible excitation of the 310 keV state of Th is neglected.

U. τ = 9.3 ± 0.5 m. (7) α/capture branching ratio ~ 4/1. Nothing is known about the excited states of Pa, but on account of the smallness of the disintegration energy it is provisionally assumed that capture is by a single mode to the ground state of that nucleus.
Pu. τ = 9.0 ± 0.5 h. (8) α/capture branching ratio ~ 1/30. Nothing is known about the excited states of Np, but the disintegration energy is again very small, and capture by a single mode to the ground state is once more postulated.

Pu. τ = 40 d. (9) α/capture branching ratio very small. James, Thompson and Hopkins (1948) report “no γ-rays”. Disintegration energy is very small and capture is therefore assumed to be by a single mode to the ground state. The possibility of excitation of the 60 keV state of Np (Melander and Slätis, 1948; Seaborg, James and Morgan, 1948) is neglected.

Pa. τ = 17 d. (10) β/capture branching ratio 1/9, α/capture branching ratio ≤ 3 × 10^(−[…]). Osborne, Thompson and Van Winkle (1946) report a γ-ray of 940 keV energy. Capture assumed, very tentatively, to be to the 940 keV state of Th in about 70 per cent. of disintegrations.

Np. τ = 4.5 d. (11) α/capture branching ratio very small. James, Florin, Hopkins and Ghiorso (1948) report a γ-ray of 1.8 MeV energy. This is tentatively identified with the 1.6 MeV γ-ray emitted in the disintegration […] (Bradt and Scherrer, 1945, 1946), and a large fraction of the capture disintegrations is assumed to be to a 1.6 MeV state of U in this case.

Cf. τ ~ 45 m. (12) α/capture branching ratio probably somewhat greater than 1. Nothing is known about the excited states of Bk, but the smallness of the disintegration energy makes it likely that capture is by a single mode to the ground state.
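
The notes fix, for each species, the partial disintegration constant that is plotted as ordinate on the Sargent diagram. A minimal sketch of that arithmetic follows, assuming that α-emission and capture are the only competing modes and that all capture goes by the single mode considered; the function name and the choice of note (7) as the worked case (half-value period read as 9.3 minutes, α/capture ratio 4/1) are illustrative assumptions.

```python
import math


def partial_capture_constant(half_life_s, competing_to_capture_ratio):
    """Partial disintegration constant for capture, in sec^-1.

    half_life_s: directly determined half-value period, in seconds.
    competing_to_capture_ratio: branching ratio (competing mode)/(capture),
        e.g. 4.0 for an alpha/capture ratio of 4/1.
    Assumes exactly two competing modes.
    """
    total_constant = math.log(2) / half_life_s
    capture_fraction = 1.0 / (1.0 + competing_to_capture_ratio)
    return total_constant * capture_fraction


# Note (7): half-value period 9.3 min, alpha/capture ~ 4/1.
lambda_point_7 = partial_capture_constant(9.3 * 60.0, 4.0)
print(f"lambda (point 7) = {lambda_point_7:.3e} sec^-1")
print(f"log10 lambda     = {math.log10(lambda_point_7):.2f}")
```

Where a note divides capture between two states of the product nucleus (as for points 3 and 4), the result would be multiplied further by the fraction assigned to each state.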


The Sargent diagram for the 11 species treated in the notes is given
in fig. 1. Obviously the point for U is the most well-attested point
for a transition which can provisionally be regarded as “allowed”. On
that assumption the straight line 0 is drawn with the theoretical slope
of 2 through this point. It will be seen that within the indicated
uncertainty of present experimental knowledge the points for Cf and
Pu also lie on this line, and that no points lie significantly above it.
Further, the three species Cf, Pu and U are the only even-even
species represented on the diagram. It is characteristic of the Sargent
diagram for the heavy β-active species, also, that the points for even-Z
bodies in general are concentrated along the upper curves of the diagram.
In fig. 1, when line 0 has been drawn, 8 of the remaining 9 points can be
accounted for, almost within the assigned limits of experimental
uncertainty, by the straight line II. The slope of this line is considerably
greater than 2, as befits the line for a forbidden transition, and it has
been labelled II rather than I on the assumption that the first-forbidden
line (as in the Sargent diagram for heavy β-emitters) is in fact parallel
to the allowed line and not very far removed from it, and the
second-forbidden line is the first to show a greater slope (and, in a definitive
diagram, to be curved rather than straight)*. Beyond line II on fig. 1

* The terms “first-forbidden”, “second-forbidden”, etc., are used throughout
this paper entirely empirically. They correspond to a classification more directly
in relation to spin change (by 2 units for second-, by 3 units for third-forbidden
transitions, etc.) than to parity change.





the point for Np is well attested, and is very probably indicative of
the position of the third-forbidden line shown tentatively as III. It
may be noted here that the points for Pa and Np (points 10 and 11) would both lie
nearer to line III (and in fact rather to the right of it) than to line II,
if ground-to-ground state disintegrations were postulated as the
predominant modes in these two cases. Only further study of the
intensities of the γ-radiations concerned can settle these ambiguities.
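
On the logarithmic plot of fig. 1 a power law relating disintegration constant to energy appears as a straight line whose slope is the exponent: 2 for the allowed line 0, and 4 for the main line of Thompson's diagram discussed earlier. The following sketch shows how a line of given slope through an anchor point is evaluated; the anchor values used in the example are hypothetical placeholders, not data from the paper.

```python
import math


def log10_lambda_on_line(energy_mev, anchor_energy_mev,
                         anchor_log10_lambda, slope=2.0):
    """log10 of the disintegration constant on a straight Sargent line.

    The line passes through (anchor_energy_mev, anchor_log10_lambda) on a
    log-log plot and has the given slope, i.e. lambda varies as energy**slope.
    """
    return anchor_log10_lambda + slope * (
        math.log10(energy_mev) - math.log10(anchor_energy_mev)
    )


# Hypothetical anchor: log10(lambda) = -4.0 at a neutrino energy of 0.4 MeV.
ANCHOR_E = 0.4
ANCHOR_LOG_LAMBDA = -4.0

for e in (0.2, 0.4, 0.8):
    allowed = log10_lambda_on_line(e, ANCHOR_E, ANCHOR_LOG_LAMBDA, slope=2.0)
    steeper = log10_lambda_on_line(e, ANCHOR_E, ANCHOR_LOG_LAMBDA, slope=4.0)
    print(f"E = {e:.1f} MeV: slope 2 -> {allowed:.2f}, slope 4 -> {steeper:.2f}")
```

Doubling the energy raises the disintegration constant by a factor of 4 on a line of slope 2 and by a factor of 16 on a line of slope 4, which is the distinction at issue between the theoretical allowed line and a fourth-power line.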

As regards the absolute position of the allowed line 0 on fig. 1, the
approximate method of calculating values of the “comparative half-life”,
ft, of the Fermi theory of β-disintegration recently given by
Moszkowski (1951) has been used to compare the degrees of forbiddenness
represented by this line and by the allowed line on the diagram for
β-emitters having 81 ≤ Z ≤ 83 given by Feather and Richardson (1948).



Fig. 1. Sargent diagram for capture-active species having 89 ≤ Z ≤ 98.

$\lambda$: partial disintegration constant for K-electron capture (in sec⁻¹).

$E - W_K - E'_r$: neutrino (transition) energy (in MeV).

Taking 93 as the effective value of Z for the purpose of fig. 1 of this
paper, and Z = 82 for the Sargent diagram previously published, we
obtain $\log_{10}(ft) = 4.6$ for the line representing the capture disintegrations
which we now assume to be allowed disintegrations, and $\log_{10}(ft) = 5.2$
for that representing those β-disintegrations generally accepted as
correctly so classified. The agreement between these two values is very
clearly encouraging*. We can reasonably conclude that the allowed
line has been fixed with fair precision on fig. 1.

Further support for the correctness of our identification is obtained
from a consideration of the capture transformation of the 15 species
having 89 ≤ Z ≤ 98 for which evidence regarding disintegration modes

* It is possibly significant that the former value is less than the latter, since
the partial disintegration constants here used in fact include contributions from
L-, M-electron capture, etc.





was considered too indefinite for use in the construction of fig. 1. For
some of these species capture no doubt proceeds predominantly by
ground-to-ground state transitions. If we make this assumption for
each in turn we effectively place the representative point on fig. 1 as
high on the diagram, and as far to the right, as it can possibly be placed.
In this way we find that no new point is placed to the left of the line 0,
and that three points fall sufficiently near to this line (though to the
right of it) to suggest that the species in question transform, in a major
mode in each case, by allowed (or first forbidden) transitions to the
ground state, or to a low-lying excited state, of the product nucleus
concerned. The three species for which this behaviour is indicated are
Th, Np and […]. Only in one respect do these assignments run
counter to reported fact: it would appear impossible, if they are accepted,
that there should be any prominent component of “hard” γ-radiation
associated with the capture transformation of Np, as has been suggested
(Magnusson, Thompson and Seaborg, 1950).

Concerning the capture transformations of the remaining 12 species
in this group nothing very definite can be said so long as our simplifying
assumption is retained. Probably more by coincidence than otherwise,
the representative points for the capture transformations of […] and
Pa then fall on or very near to line II and those for Pa, Am and
Cm fall on line III, but when more is known of partial disintegration
modes in these cases it may well appear that the ground-to-ground state
transitions are more forbidden than these placings would indicate, and
that the main disintegration mode in each case is to an excited state of
the product nucleus and is less forbidden than here supposed. Our
conjectured placing is most likely to prove correct for Ac, since the
energy available for excitation of the product nucleus is small, and
probably least likely to be significant for Pa (disintegration energy
1.92 MeV), because it is known from a study of the β- and γ-radiations
of MsTh₂ that many low-lying states of RdTh are available for
excitation by the capture process. With Am, capture transformation
predominantly to the 1.2 MeV state of Pu revealed in the β-disintegration
of Np (Jaffey and Magnusson, 1949) would be represented by a
point falling closely on line II, and a point falling on the same line would
represent the capture transformation of Am, if, as has been suggested
(Seaborg, James and Morgan, 1948), the predominant mode in this case
results in excitation of a state of 1.3 MeV energy in Pu. Possibly
these two transformations are those which occur in fact, and are correctly
described as second forbidden.

The Disintegration Energies of Heavy β-emitters

In any survey of nuclear stability, one profitable method of exhibiting
stability limits is to plot β-disintegration energy, $E_\beta$, against A (mass
number), N (neutron number) or Z, and to draw in the various smooth
curves for A − 2Z (isotopic number) constant. This is the method
employed by Saha and Saha (1946). Suess (1951) has recently pointed
to marked discontinuities in the curves of $E_\beta$ against N for A − 2Z = 13
and A − 2Z = 26, occurring at N = 50 and N = 82, respectively, and
has suggested that a similar discontinuity in the appropriate curve
probably occurs at N = 126. Hitherto data for the construction of the
curves of $E_\beta$ against N in just this region have been lacking through
want of knowledge of the energies of capture transformation of certain
“key” species. Fig. 1 enables us to make predictions regarding these
energies, and the present section will review these predictions, and
discuss the curves exhibiting $E_\beta$ as a function of N which may be
constructed on the basis of the values so obtained. The capture-active
species in question are isotopes of Au, Tl, Pb, Bi, Po and At. Within the
accuracy required, fig. 1 may be regarded as applicable over the extended
range of Z values, 79 ≤ Z ≤ 98, although account should strictly be
taken of a general shift of the lines of the figure downwards, as Z
decreases. For the allowed line 0 the theoretical downwards shift, for a
decrease in Z from 93 to 83, is of the order of 0.16 in $\log_{10}\lambda$.

Table II sets out the estimated values of E for the capture-active
species in the above list (these are the values of the negative energy of
β-disintegration for the respective daughter products) and the notes
below discuss individual cases as before. One general principle may be
stated at the outset. Two of the species (Au and Tl) are of even
A and odd Z and are both β- and capture-unstable. Since in each case
both modes of disintegration give rise to daughter products which are of
even A and even Z, it is assumed, for each species, that the
ground-to-ground state transitions (β- and capture transitions) are of the same
degree of forbiddenness.


Table II

Species   E (MeV)
Au        < 0.36
Tl        0.76
Tl        1.47
Tl        0.1[…]
Pb        0.4 ± 0.3
Pb        0.61
Pb        0.2[…] ± 0.2
Bi        2.7 ± 0.2
Bi        1.95
Bi        1.84
Bi        2.59
Bi        < 2.70
Po        0.9[…]
Po        1.58 ± 0.1
Po        < 0.2
At        3.1 ± 0.3
At        0.96




Notes on individual species are as follows (here E′ is written for
$E - E'_r$):

Au. τ = 2.69 d. β/capture branching ratio > 280 (Renard, 1949). Since the ground-to-ground state β-disintegration is probably third-forbidden, the same is assumed of the ground-to-ground state capture transformation (v. sup.).

Tl. τ = 72 h. It is assumed that the predominant mode involves a second-forbidden capture transition with $E' - W_K$ = 0.47 MeV to an excited state of Hg having $E'_r$ = 0.21 MeV (Neumann and Perlman, 1950).

Tl. τ = 13 d. It is assumed that the predominant mode involves a third-forbidden capture transition with $E' - W_K$ = 0.99 MeV to an excited state of Hg having $E'_r$ = 0.40 MeV (Maurer and Ramm, 1942).

Tl. τ = 2.7 y. β/capture branching ratio > 50 (estimated from intensity of soft quantum radiation as measured by Evans, 1950). It is assumed that the ground-to-ground state capture transformation is third-forbidden as the ground-to-ground state β-disintegration appears to be (v. sup.).

Pb. τ assumed long. Degree of forbiddenness assumed equal to or less than three.

Pb. τ = 52 h. It is assumed that excited states of Tl having $E'_r$ = 0.47, 0.28 MeV are both excited in this transformation (Lutz, Pool and Kurbatov, 1944; Slätis and Siegbahn, 1949). Transition to the 0.47 MeV state is assumed allowed with $E' - W_K$ = 50 keV.

Pb. τ assumed very long. Degree of forbiddenness assumed equal to or less than three.

Bi. τ = 12 h. It is assumed that the predominant mode involves a third-forbidden capture transition with $E' - W_K$ = 2.4 ± 0.2 MeV to an excited state of Pb having $E'_r$ = 0.217 MeV (Sunyar et al., 1950).

Bi. τ = 14.5 d. It is assumed that the predominant mode involves an allowed capture transition with $E' - W_K$ = 25 keV to an excited state of Pb having $E'_r$ = 1.84 MeV (Karraker and Templeton, 1950).

Bi. τ = 6.4 d. It is assumed that the predominant mode involves an allowed capture transition with $E' - W_K$ = 36 keV to an excited state of Pb having $E'_r$ = 1.72 MeV (Alburger and Friedlander, 1951).

Bi. τ = 50 y. The value of E in this case is very insensitive to the degree of forbiddenness assumed for the predominant capture mode, so long as this is to an excited state of Pb having $E'_r$ = 2.49 MeV (Neumann and Perlman, 1951). For sake of definiteness a value $E' - W_K$ = 12 keV (second-forbidden transition) is here assumed.

Bi. τ probably very long. It is assumed that the disintegration energy available is insufficient for the excitation of the 2.62 MeV state of Pb by a K-capture transformation (Neumann and Perlman, 1951).

Po. τ = 9 d. α/capture branching ratio ~ 1/9 (Howland, Templeton and Perlman, 1947). It is assumed that the predominant mode involves an allowed capture transition with $E' - W_K$ = 27 keV to an excited state of Bi having $E'_r$ = 0.8 MeV (Templeton, Howland and Perlman, 1947).

Po. τ = 5.7 h. It is assumed that the predominant mode involves an allowed transition with $E' - W_K$ = 0.17 MeV to an excited state of Bi having $E'_r$ = 1.3 MeV (Templeton, Howland and Perlman, 1947).





Po. τ = 2.93 y. α/capture branching ratio > 50 (Templeton, 1950). It is assumed that the ground-to-ground state capture transformation is of order of forbiddenness not greater than three, unless the α/capture branching ratio is in fact much greater than the limit given.

At. τ = 5.5 h. α/capture branching ratio ~ 1/20 (Barton, Ghiorso and Perlman, 1951). It is assumed that the predominant mode involves a third-forbidden transition with $E - W_K$ = 3.0 ± 0.3 MeV to the ground state of Po.

At. τ = 7.5 h. α/capture branching ratio 40.9/59.1 (Neumann and Perlman, 1951). It is assumed that the predominant mode involves a second-forbidden capture transition with $E - W_K$ = 0.87 MeV to the ground state of Po.

Fig. 2. β-disintegration energy, $E_\beta$ (in MeV), as a function of neutron number, N, for species
of odd A having isotopic numbers from 41 to 53 inclusive. Species of odd Z
represented by open circles, those of even Z by full circles.


The amount of conjecture still remaining in our conclusions has
already been alluded to, and it will be evident from the above notes that
it is not small, but that it is not entirely random conjecture will be
appreciated from the fact that it has been subject to the following checks.
First, in certain cases the method of the closed decay cycle provides a
natural check. Thus from a knowledge of α-disintegration energies we
have, for the two At species and the Bi species to which each is linked
by α-disintegration (in MeV),

$$E(\mathrm{At}) = E(\mathrm{Bi}) - 1.58 \pm 0.02$$
$$E(\mathrm{At}) = E(\mathrm{Bi}) + 0.80 \pm 0.03,$$

whereas from Table II we have −1.63 and +1.16 ± 0.3 for the numerical
terms in these two equalities. Any choice of possibilities, consistent




with the validity of our Sargent diagram, other than the choice indicated
in the notes, would not have resulted in such good agreement.

The second check on conjecture is that when resulting values of
−E are used in constructing curves of $E_\beta$ against N for A − 2Z constant,
these curves shall be as smooth as possible. We pass therefore, directly,
to a consideration of these curves, always remembering that this criterion
has been adopted in deriving some of the data used in their construction.
The curves for odd values of the isotopic number A − 2Z from 41 to 53
inclusive are given in fig. 2, those for even isotopic number from 40 to 52
in fig. 3 or fig. 4 depending upon whether the charge numbers of the
β-active species involved are odd or even. In terms of proton-neutron


Fig. 3. $E_\beta$ (in MeV) as a function of N for species of even A and odd Z. Isotopic numbers
from 40 to 52 inclusive.


parities, fig. 2 refers to the β-disintegrations of odd-even and even-odd
nuclei, fig. 3 to the β-disintegrations of odd-odd nuclei, and fig. 4 to
those of even-even nuclei. It will be realized that some of the values
plotted in these figures depend upon measurements of β- and γ-ray
energies directly, and others to a greater or less degree upon the use of
the method of closed decay cycles with directly measured, or with inferred,
values of α-disintegration energy. Where the estimated uncertainty of a
plotted point is greater than ± 0.2 MeV the assumed magnitude of the
probable error is represented graphically in the usual way. In a few
cases total disintegration energies have had to be assumed without fully
convincing evidence from β-γ correlation studies or otherwise, and
one α-disintegration energy is still conjectural. Details of these cases
are given below.




RaC. Total disintegration energy taken as 3.17 MeV, rather than 3.78 MeV. Determination of $E_\beta$ for Ac as 1.17 ± 0.07 MeV (no γ-rays) by Hall and Templeton (1950) makes this the more probable assumption.

MsTh₂. Total disintegration energy assumed 2.0 ± 0.2 MeV.

[…] Total disintegration energy assumed 0.58 MeV (Elliott, reported in Nuclear Data, NBS 499, 1950).

Pa. UZ being the ground-state isomer, total disintegration energy taken as 2.32 − 0.39 = 1.93 MeV (Bradt and Scherrer, 1945).

[…] Total disintegration energy assumed 0.60 ± 0.05 MeV (see Melander and Slätis, 1948).

Np. Total disintegration energy assumed 0.68 MeV (see Fulbright, 1947).

Bi. α-disintegration energy assumed 3.3 ± 0.3 MeV.

Fig. 4. $E_\beta$ (in MeV) as a function of N for species of even A and even Z. Isotopic numbers
from 40 to 52 inclusive.


Reviewing figs. 2-4, it is obvious at first sight that discontinuities
occur on each figure in the region of N = 126, as suggested by Suess,
but closer inspection shows certain unexpected features. Thus on
fig. 2 (odd-even and even-odd β-emitters) the discontinuity is evident
only on the curve for isotopic number 43, the neighbouring curves for
A − 2Z = 41, 45 showing no signs of irregularity over the same range
of N.* As is emphasized by comparison of the actual curve of $E_\beta$
against N for A − 2Z = 43 with the dotted curves, the sense of the
irregularity is that the β-disintegration energies of Tl, Pb and
Bi are all low, and those of At and Em both high, in comparison with
what might have been expected by simple interpolation. (The fact that
all these species are β-stable, with the somewhat doubtful exception of

* The curve for A − 2Z = 41 (Z even) shows a marked step between Os
(N = 117) and Pt (N = 119), which would make it appear that N = 118 is a
minor “magic” number.




[…], does not make this statement any less significant.) Abnormally
high β-stability, therefore, is characteristic of the species (Z = 82,
N = 125), (Z = 83, N = 126) and (Z = 81, N = 124), here listed in order
of decreasing abnormality, and abnormally low β-stability characteristic
of (Z = 85, N = 128) and (Z = 86, N = 129). The β-stability of all
other odd-A species represented in fig. 2 (197 ≤ A ≤ 243) is normal,
except that it appears to be exceptionally low for Tl (Z = 81, N = 128)
and Tl (Z = 81, N = 130). Possibly these last abnormalities are
indicative of the occurrence on the curves for A − 2Z = 47 and 49 of
discontinuities similar in form to that exhibited for A − 2Z = 43, but
it is likely to be some considerable time before this possibility can be
checked against experimental fact. Meanwhile the strict normality of
the β-disintegration energies of all species having A − 2Z = 45 is the
more remarkable. Obviously the incidence of shell-closure both at
Z = 82 and at N = 126 complicates the position in respect of the species
now under discussion, but the fact that the β-disintegration energy of
Tl (Z = 81, N = 126) is normal whereas that of Pb (Z = 82, N = 125)
is some 2.3 MeV low, that $E_\beta$ for Pb (Z = 82, N = 127) is normal and
for Bi (Z = 83, N = 126) some 1.6 MeV low, is at first sight surprising.
It seems that it can only be explained on the assumption that the
discontinuity in these cases is determined more by the change in Z than
by the change in N.
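
Each curve of figs. 2-4 is labelled by its isotopic number A − 2Z, which, since A = Z + N, is simply N − Z. The short sketch below checks this bookkeeping for several (Z, N) pairs quoted in the preceding paragraph; the function name is an assumption made for illustration.

```python
def isotopic_number(z, n):
    """Isotopic number A - 2Z for charge number z and neutron number n."""
    mass_number = z + n
    return mass_number - 2 * z


# (Z, N) pairs as quoted in the discussion of fig. 2.
quoted_pairs = [
    (83, 126),
    (81, 124),
    (86, 129),
    (81, 128),
    (81, 130),
    (81, 126),
    (82, 127),
]

isotopic_numbers = [isotopic_number(z, n) for z, n in quoted_pairs]
for (z, n), value in zip(quoted_pairs, isotopic_numbers):
    print(f"Z = {z}, N = {n}: A = {z + n}, A - 2Z = {value}")
```

All of these species have odd A, as fig. 2 requires, and fall on the curves for A − 2Z = 43, 45, 47 or 49 named in the text.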

On fig. 3 (odd-odd nuclear species) the only marked discontinuity
appears on the curve for A − 2Z = 42. Here At (Z = 85, N = 127)
is abnormally β-unstable (to the extent of about 1 MeV) and Bi
(Z = 83, N = 125) abnormally β-stable (by about 0.6 MeV). A more
complicated situation is revealed by fig. 4. Both for A − 2Z = 42 and
for A − 2Z = 44 abrupt decreases in β-stability are indicated between
N = 126 and N = 128 (as between Po and Em and between Pb
and Po, respectively), that is over the range of neutron numbers
which includes the “magic” number 126, but there is the added
suggestion that for A − 2Z = 42 the species having N = 120, 122 and 124
are unusually β-stable, and for A − 2Z = 44 that those having N = 130,
132 and 134 are unusually β-unstable. This represents a wider range of
influence of neutron shell-closure than is otherwise indicated. Beyond
this remark, however, it does not appear profitable here to pursue these
purely qualitative considerations farther.

The α-Disintegration Energies of Certain Isotopes of Lead and Bismuth

It has already been stated that, for the purpose of calculating certain
energies of capture transformation, the energy of α-disintegration of
Bi has been assumed to be 3.3 ± 0.3 MeV. This assumed value,
together with the known energies of β-disintegration of Pb and
[…], leads to a value of 2.4 ± 0.4 MeV for the energy of α-disintegration of
Pb. In a similar way, by the method of closed decay cycles, using
the energies of β-disintegration plotted in figs. 2-4, we obtain energies
of α-disintegration of two further species of Bi and two of Pb. For these
calculations the known energies of α-disintegration of two Po species (Karraker,
Ghiorso and Templeton, 1951) serve as starting material. Table III
collects the values so obtained.


Table III

Species                          Bi            Bi          Bi           Pb            Pb          Pb
α-disintegration energy (MeV)    (3.3 ± 0.3)   4.3 ± 0.1   4.25 ± 0.1   (2.4 ± 0.4)   2.0 ± 0.3   1.17 ± 0.15

The surprising feature of the results given in this table is the smallness
of the α-disintegration energies of the isotopes of lead (and particularly
of […]). This feature has not hitherto been detected or predicted by
the systematizers (Pryce, 1950).


REFERENCES TO LITERATURE 

Albouy and Teillac, 1951. C. R. Acad. Sci., Paris, ccxxxii, 326.
Alburger and Friedlander, 1951. Phys. Rev., lxxxi, 523.
Barton, Ghiorso and Perlman, 1951. Ibid., lxxxii, 13.
Bradt and Scherrer, 1945. Helv. Phys. Acta, xviii, 405.
Bradt and Scherrer, 1946. Ibid., xix, 307.
Crane, Ghiorso and Perlman, 1949. Unpublished.
Evans, 1950. Proc. Phys. Soc., A, lxiii, 675.
Feather and Richardson, 1948. Ibid., lxi, 452.
Fulbright, 1947. MDDC 753.
Hall and Templeton, 1950. UCRL 057.
Howland, Templeton and Perlman, 1947. Phys. Rev., lxxi, 552.
Jaffey and Magnusson, 1949. AECD 2636.
James, Florin, Hopkins and Ghiorso, 1948. NNES 14B, 22.8.
James, Thompson and Hopkins, 1948. NNES 14B, 22.10.
Karraker, Ghiorso and Templeton, 1951. UCRL 1091.
Karraker and Templeton, 1950. UCRL 040.
Knight and Macklin, 1949. Phys. Rev., lxxv, 34.
Konopinski, 1943. Rev. Mod. Phys., xv, 209.
Lutz, Pool and Kurbatov, 1944. Phys. Rev., lxv, 61.
Magnusson, Thompson and Seaborg, 1950. Ibid., lxxviii, 303.
Maurer and Ramm, 1942. Zeits. Physik, cxix, 602.
Melander and Slätis, 1948. Phys. Rev., lxxiv, 709.
Moszkowski, 1951. Ibid., lxxxii, 35.
Neumann and Perlman, 1950. Ibid., lxxviii, 101.
Neumann and Perlman, 1951. Ibid., lxxxi, 958.
Osborne, Thompson and Van Winkle, 1946. NNES 14B, 19.11.
Pryce, 1950. Proc. Phys. Soc., A, lxiii, 692.
Renard, 1949. C. R. Acad. Sci., Paris, ccxxviii, 387.
Saha and Saha, 1946. Trans. Nat. Inst. Sci., India, ii, 193.
Sargent, 1933. Proc. Roy. Soc., A, cxxxix, 659.
Sargent, 1939. Can. Journ. Research, xvii, 82.
Seaborg, James and Morgan, 1948. NNES 14B, 22.1.
Slätis and Siegbahn, 1949. Arkiv Mat. Astron. Fysik, xxxviA, 21.
Studier, 1947. AECD 2444.
Sullivan, Kohman and Swartout, 1945. Plutonium Project Report, HEW 3, 1030.
Sunyar, Alburger, Friedlander, Goldhaber and Scharff-Goldhaber, 1950. Phys. Rev., lxxviii, 320.
Templeton, 1950. UCRL 010.
Templeton, Howland and Perlman, 1947. Phys. Rev., lxxii, 758.
Thompson, 1949. Ibid., lxxvi, 319.


(Issued separately February 9, 1952) 



Proof of the Prime Number Theorem 


267 


XVIII.—The Elementary Proof of the Prime Number Theorem.* 
By E. M. Wright, University of Aberdeen. 

(MS. received June 4, 1951. Read November 5, 1951)

Synopsis 

The author presents a modification of the recently discovered “elementary”
proof of the Prime Number Theorem. Nothing is assumed from the theory of
numbers except the Fundamental Theorem of Arithmetic. In the second part
of the proof the elements of the integral calculus are used to make clearer the
basic ideas on which this part depends.

1. The Prime Number Theorem asserts that π(x), the number of primes
less than or equal to x, is asymptotic to x/log x as x → ∞. This was
first proved (independently) by Hadamard (1896) and de la Vallée
Poussin (1896). Many authors shortened and simplified the proof, but
it was not until very recently that Erdős and Selberg (Erdős, 1949) and
Selberg (1949) gave proofs which are “elementary” (in the technical logical
sense). Each of these proofs depends on a very striking inequality due to
Selberg and on a few well-known theorems in the elementary theory of
primes. Van der Corput (1949) presented the Erdős-Selberg form of the
proof ab initio, giving classical proofs of the known results which he requires.
Here I present a modification of Selberg’s proof, in which I assume
nothing from the theory of numbers, except the Fundamental Theorem
of Arithmetic that every integer n > 1 is the product of primes,

$$n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k} \qquad (p_1 < p_2 < \cdots < p_k), \tag{1.1}$$

in just one way. (This is, of oourse, implicitly assumed by all the authors 
mentioned.) But, apart from certain trivialities, I do not use the 
classical proofs of the known results required (especially (4.3) and (4.4) 
below), but obtain these as by-produots of the main argument. From 
“analysis” I assume the well-known result 

L 1 = log a: + y -f 0 (*) (*2*1) (1.2) 

but less would suffioe (see § 10). 

Proof of Selberg's Inequality

2. In what follows $m, n, d, k, \ldots$ denote positive integers, $p$ (with
or without a suffix) always a prime number and $x$ and $y$ any real numbers
not less than 1. The $O(\,)$ notation refers to the passage of $x$ to infinity.

* This paper was assisted in publication by a grant from the Carnegie Trust 
for the Universities of Scotland, 


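As usual, $[z]$ is the greatest integer not exceeding $z$, and $d \mid n$ denotes
that $n = dk$ for some integral $k$. The Möbius function $\mu(n)$ is defined
by:

$$ \mu(1) = 1, \qquad \mu(p_1 p_2 \cdots p_k) = (-1)^k, $$

provided all the $p$ are different, and $\mu(n) = 0$ if $n$ is divisible by any $p^2$.
We write also

$$ \Lambda(n) = \log p \quad (n = p^a), \qquad \Lambda(n) = 0 $$

for all other $n$ and

$$ \psi(x) = \sum_{n \le x} \Lambda(n). $$

If $n > 1$ has the form (1.1), the only divisors $d$ of $n$ for which $\mu(d) \ne 0$
are $1, p_1, \ldots, p_k, p_i p_j\ (i \ne j), p_i p_j p_l$, etc., and so

$$ \sum_{d \mid n} \mu(d) = \mu(1) + \sum \mu(p_i) + \sum \mu(p_i p_j) + \cdots
 = 1 - \binom{k}{1} + \binom{k}{2} - \binom{k}{3} + \cdots = (1 - 1)^k = 0. \tag{2.1} $$

On the other hand,

$$ \sum_{d \mid 1} \mu(d) = \mu(1) = 1. \tag{2.2} $$

Also,

$$ \sum_{d \mid n} \Lambda(d) = \sum_{i=1}^{k} \sum_{a=1}^{a_i} \Lambda(p_i^{a}) = \sum_{i=1}^{k} a_i \log p_i = \log n. \tag{2.3} $$

By (2.1), (2.2) and (2.3),

$$ -\sum_{d \mid n} \mu(d) \log d = \sum_{d \mid n} \mu(d) \log\frac{n}{d}
 = \sum_{d \mid n} \mu(d) \sum_{h \mid (n/d)} \Lambda(h)
 = \sum_{h \mid n} \Lambda(h) \sum_{d \mid (n/h)} \mu(d) = \Lambda(n). \tag{2.4} $$

Hence

$$ \sum_{d \mid n} \Lambda(d) \Lambda\!\left(\frac{n}{d}\right)
 = -\sum_{d \mid n} \Lambda(d) \sum_{h \mid (n/d)} \mu(h) \log h
 = -\sum_{h \mid n} \mu(h) \log h \sum_{d \mid (n/h)} \Lambda(d)
 = -\sum_{h \mid n} \mu(h) \log h \, \log\frac{n}{h} $$
$$ = \Lambda(n) \log n + \sum_{d \mid n} \mu(d) \log^2 d. \tag{2.5} $$

Hence, if $n > 1$,

$$ \sum_{d \mid n} \mu(d) \log^2\frac{x}{d}
 = -2 \log x \sum_{d \mid n} \mu(d) \log d + \sum_{d \mid n} \mu(d) \log^2 d
 = 2\Lambda(n) \log x - \Lambda(n) \log n + \sum_{d \mid n} \Lambda(d) \Lambda\!\left(\frac{n}{d}\right), $$

while

$$ \sum_{d \mid 1} \mu(d) \log^2\frac{x}{d} = \log^2 x. $$

Hence, if we write

$$ S(x) = \sum_{n \le x} \sum_{d \mid n} \mu(d) \log^2\frac{x}{d}, $$

we have

$$ S(x) = \log^2 x + 2\psi(x) \log x - \sum_{n \le x} \Lambda(n) \log n + \sum_{hk \le x} \Lambda(h) \Lambda(k) $$
$$ = \log^2 x + \psi(x) \log x + \sum_{n \le x} \Lambda(n) \log\frac{x}{n} + \sum_{mn \le x} \Lambda(m) \Lambda(n). \tag{2.6} $$

Since no term in the last expression can be negative, we have

$$ \psi(x) \log x \le S(x). \tag{2.7} $$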
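The arithmetical identities (2.1) to (2.5) are exact statements about each integer n, so they can be spot-checked by direct computation. The following is a minimal sketch, not part of the paper; the helper names (`factorize`, `mobius`, `mangoldt`, `check_identities`) are illustrative assumptions, and trial division is used only because the range is small.

```python
from math import log, isclose

def factorize(n):
    """Return {prime: exponent} for n >= 1 by trial division."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def mobius(n):
    """Moebius function: 0 if a square divides n, else (-1)^(number of primes)."""
    f = factorize(n)
    if any(e > 1 for e in f.values()):
        return 0
    return -1 if len(f) % 2 else 1

def mangoldt(n):
    """Lambda(n) = log p if n is a power of a single prime p, else 0."""
    f = factorize(n)
    if len(f) == 1:
        return log(next(iter(f)))
    return 0.0

def check_identities(n):
    """Check (2.1)/(2.2), (2.3), (2.4) and (2.5) for one integer n."""
    ds = [d for d in range(1, n + 1) if n % d == 0]
    s_mu = sum(mobius(d) for d in ds)
    s_lambda = sum(mangoldt(d) for d in ds)
    s_mu_log = -sum(mobius(d) * log(d) for d in ds)
    lhs = sum(mangoldt(d) * mangoldt(n // d) for d in ds)
    rhs = mangoldt(n) * log(n) + sum(mobius(d) * log(d) ** 2 for d in ds)
    return (s_mu == (1 if n == 1 else 0)
            and isclose(s_lambda, log(n), abs_tol=1e-9)
            and isclose(s_mu_log, mangoldt(n), abs_tol=1e-9)
            and isclose(lhs, rhs, abs_tol=1e-9))

print(all(check_identities(n) for n in range(1, 200)))
```

A floating-point tolerance is used because the identities involve sums of logarithms; the check is a sanity test of the formulas, not a proof.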

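3. If $x e^{-k} \le n < x e^{1-k}$, we have

$$ k - 1 < \log\frac{x}{n} \le k $$

and so, if $h > 0$,

$$ \sum_{n \le x} \log^h\frac{x}{n} \le \sum_{k \ge 1} k^h \sum_{x e^{-k} \le n < x e^{1-k}} 1
 \le x \sum_{k=1}^{\infty} k^h e^{1-k} = O(x). \tag{3.1} $$

If we put $h = 1$, this gives us

$$ \sum_{n \le x} \log n = [x] \log x + O(x) = x \log x + O(x). \tag{3.2} $$

Among the numbers $1, 2, 3, \ldots, [x]$, there are just $[x/p]$ multiples
of $p$, just $[x/p^2]$ multiples of $p^2$, and so on. Hence $[x]!$ is divisible by
$p$ just $j$ times, where

$$ j = \left[\frac{x}{p}\right] + \left[\frac{x}{p^2}\right] + \cdots. $$

Hence

$$ \sum_{n \le x} \log n = \log([x]!) = \sum_{p \le x} \log p \left( \left[\frac{x}{p}\right] + \left[\frac{x}{p^2}\right] + \cdots \right)
 = \sum_{n \le x} \left[\frac{x}{n}\right] \Lambda(n). $$

If we remove the square brackets we introduce an error of at most

$$ \sum_{n \le x} \Lambda(n) = \psi(x). $$

Hence, by (3.2),

$$ x \sum_{n \le x} \frac{\Lambda(n)}{n} = x \log x + O(x) + O(\psi(x)). \tag{3.3} $$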


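4. By (2.1) and (2.2),

$$ S(x) - \gamma^2 = \sum_{n \le x} \sum_{d \mid n} \mu(d) \left\{ \log^2\frac{x}{d} - \gamma^2 \right\}
 = \sum_{d \le x} \mu(d) \left[\frac{x}{d}\right] \left\{ \log^2\frac{x}{d} - \gamma^2 \right\}, $$

since there are $[x/d]$ multiples of $d$ which are $\le x$. If we remove the
square brackets the error is at most

$$ \sum_{d \le x} \left\{ \log^2\frac{x}{d} + \gamma^2 \right\} = O(x) $$

by (3.1) with $h = 2$. Hence, by (1.2),

$$ S(x) = x \sum_{d \le x} \frac{\mu(d)}{d} \left\{ \log^2\frac{x}{d} - \gamma^2 \right\} + O(x) $$
$$ = x \sum_{n \le x} \frac{1}{n} \sum_{d \mid n} \mu(d) \left\{ \log\frac{x}{d} - \gamma \right\} + O(x) $$
$$ = x \log x + x \sum_{n \le x} \frac{\Lambda(n)}{n} + O(x) \tag{4.1} $$

by (2.1), (2.2) and (2.4). Hence, by (3.3),

$$ S(x) = 2x \log x + O(x) + O(\psi(x)). \tag{4.2} $$

By (2.7) and (4.2), we have

$$ \psi(x) = O(x) \tag{4.3} $$

and so (3.3) gives us

$$ \sum_{n \le x} \frac{\Lambda(n)}{n} = \log x + O(1). \tag{4.4} $$

Also, as in (3.1),

$$ \sum_{n \le x} \Lambda(n) \log\frac{x}{n} \le \sum_{k \ge 1} k \sum_{x e^{-k} \le n < x e^{1-k}} \Lambda(n)
 \le \sum_{k \ge 1} k\, \psi(x e^{1-k}) = O\Bigl( x \sum_{k \ge 1} k e^{1-k} \Bigr) = O(x). \tag{4.5} $$

Combining (2.6) and (4.2) and using (4.3) and (4.5), we obtain

$$ \psi(x) \log x + \sum_{mn \le x} \Lambda(m) \Lambda(n) = 2x \log x + O(x). \tag{4.6} $$

From this we have the following modified form of Selberg's inequality:

$$ \psi(x) \log x + \sum_{n \le x} \Lambda(n)\, \psi\!\left(\frac{x}{n}\right) = 2x \log x + O(x). \tag{4.7} $$

Hence, by (4.5),

$$ \sum_{n \le x} \Lambda(n) \log n + \sum_{mn \le x} \Lambda(m) \Lambda(n) = 2x \log x + O(x). \tag{4.8} $$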
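The modified Selberg inequality (4.7) says that the difference between its two sides stays bounded by a constant multiple of x. A rough numerical illustration, under the assumption that integer x up to a few tens of thousands is representative, is sketched below; the function names and the limit 20000 are illustrative choices, not taken from the paper.

```python
from math import log

def mangoldt_table(limit):
    """lam[n] = log p if n is a power of the prime p, else 0.0, for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if composite[p]:
            continue
        for m in range(p * p, limit + 1, p):
            composite[m] = True
        q = p
        while q <= limit:          # mark p, p^2, p^3, ...
            lam[q] = log(p)
            q *= p
    return lam

def selberg_remainder(x, lam, psi):
    """(psi(x) log x + sum_{n<=x} Lambda(n) psi(x/n) - 2 x log x) / x for integer x."""
    total = psi[x] * log(x)
    total += sum(lam[n] * psi[x // n] for n in range(2, x + 1))
    return (total - 2 * x * log(x)) / x

LIMIT = 20000
lam = mangoldt_table(LIMIT)
psi = [0.0] * (LIMIT + 1)           # psi[n] = sum of Lambda(k) for k <= n
for n in range(1, LIMIT + 1):
    psi[n] = psi[n - 1] + lam[n]

for x in (1000, 5000, 20000):
    print(x, round(selberg_remainder(x, lam, psi), 3))
```

The printed ratios should remain of moderate size as x grows, which is all that the O(x) term in (4.7) asserts; the computation says nothing about the value of the implied constant.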






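Proof that $\psi(x) \sim x$

5. We now write $R(x) = \psi(x) - x$ in (4.7). Using (4.4) we have

$$ R(x) \log x + \sum_{n \le x} \Lambda(n) R\!\left(\frac{x}{n}\right) = O(x). $$

If we replace $n$ by $m$ and $x$ by $x/n$, this becomes

$$ R\!\left(\frac{x}{n}\right) \log\frac{x}{n} + \sum_{m \le x/n} \Lambda(m) R\!\left(\frac{x}{mn}\right) = O\!\left(\frac{x}{n}\right). $$

Hence

$$ \log x \left\{ R(x) \log x + \sum_{n \le x} \Lambda(n) R\!\left(\frac{x}{n}\right) \right\}
 - \sum_{n \le x} \Lambda(n) \left\{ R\!\left(\frac{x}{n}\right) \log\frac{x}{n} + \sum_{m \le x/n} \Lambda(m) R\!\left(\frac{x}{mn}\right) \right\} $$
$$ = O(x \log x) + O\Bigl( x \sum_{n \le x} \frac{\Lambda(n)}{n} \Bigr) = O(x \log x) $$

and this reduces to

$$ R(x) \log^2 x = -\sum_{n \le x} R\!\left(\frac{x}{n}\right) \Lambda(n) \log n
 + \sum_{mn \le x} R\!\left(\frac{x}{mn}\right) \Lambda(m) \Lambda(n) + O(x \log x), $$

whence

$$ |R(x)| \log^2 x \le \sum_{n \le x} a_n \left| R\!\left(\frac{x}{n}\right) \right| + O(x \log x), \tag{5.1} $$

where

$$ a_n = \Lambda(n) \log n + \sum_{hk = n} \Lambda(h) \Lambda(k) $$

and, by (4.8),

$$ A(x) = \sum_{n \le x} a_n = 2x \log x + O(x). \tag{5.2} $$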


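We now approximate to the sum on the right-hand side of (5.1) by
means of an integral. If $y > y' > 0$, we have

$$ \bigl|\, |R(y)| - |R(y')| \,\bigr| \le |R(y) - R(y')| \le \psi(y) - \psi(y') + (y - y') = f(y) - f(y'), \tag{5.3} $$

where

$$ 0 < f(y) = \psi(y) + y = O(y) $$

by (4.3) and $f(y)$ increases steadily with $y$. Also

$$ \sum_{n \le x-1} n \left\{ f\!\left(\frac{x}{n}\right) - f\!\left(\frac{x}{n+1}\right) \right\}
 = \sum_{n \le x} f\!\left(\frac{x}{n}\right) - [x]\, f\!\left(\frac{x}{[x]}\right) = O(x \log x). \tag{5.4} $$

Now let us write

$$ b_1 = a_1 = 0, \qquad b_n = a_n - 2 \int_{n-1}^{n} \log t \, dt \quad (n > 1) $$

and

$$ B(x) = \sum_{n \le x} b_n = A(x) - 2 \int_{1}^{[x]} \log t \, dt = O(x) \tag{5.5} $$

by (5.2). Hence

$$ \sum_{n \le x} b_n \left| R\!\left(\frac{x}{n}\right) \right| = \sum_{n \le x} \{ B(n) - B(n-1) \} \left| R\!\left(\frac{x}{n}\right) \right| $$
$$ = \sum_{n \le x-1} B(n) \left\{ \left| R\!\left(\frac{x}{n}\right) \right| - \left| R\!\left(\frac{x}{n+1}\right) \right| \right\} + B([x]) \left| R\!\left(\frac{x}{[x]}\right) \right| $$
$$ = O\Bigl( \sum_{n \le x-1} n \left\{ f\!\left(\frac{x}{n}\right) - f\!\left(\frac{x}{n+1}\right) \right\} \Bigr) + O(x) = O(x \log x) $$

by (5.3), (5.4) and (5.5). Hence

$$ \sum_{n \le x} a_n \left| R\!\left(\frac{x}{n}\right) \right| = 2 \sum_{2 \le n \le x} \left| R\!\left(\frac{x}{n}\right) \right| \int_{n-1}^{n} \log t \, dt + O(x \log x). \tag{5.6} $$

Again, for $n \ge 2$,

$$ \left| \, \left| R\!\left(\frac{x}{n}\right) \right| \int_{n-1}^{n} \log t \, dt - \int_{n-1}^{n} \left| R\!\left(\frac{x}{t}\right) \right| \log t \, dt \, \right|
 \le \int_{n-1}^{n} \left\{ f\!\left(\frac{x}{t}\right) - f\!\left(\frac{x}{n}\right) \right\} \log t \, dt
 \le n \left\{ f\!\left(\frac{x}{n-1}\right) - f\!\left(\frac{x}{n}\right) \right\} $$

and so

$$ \sum_{2 \le n \le x} \left| R\!\left(\frac{x}{n}\right) \right| \int_{n-1}^{n} \log t \, dt - \int_{1}^{x} \left| R\!\left(\frac{x}{t}\right) \right| \log t \, dt
 = O\Bigl( \sum_{n \le x-1} (n+1) \left\{ f\!\left(\frac{x}{n}\right) - f\!\left(\frac{x}{n+1}\right) \right\} \Bigr) + O(x \log x) = O(x \log x) \tag{5.7} $$

by (5.4).

By (5.1), (5.6) and (5.7), we have

$$ |R(x)| \log^2 x \le 2 \int_{1}^{x} \left| R\!\left(\frac{x}{t}\right) \right| \log t \, dt + O(x \log x). \tag{5.8} $$

We write

$$ V(\xi) = e^{-\xi} R(e^{\xi}) = e^{-\xi} \psi(e^{\xi}) - 1 = e^{-\xi} \Bigl( \sum_{n \le e^{\xi}} \Lambda(n) \Bigr) - 1. \tag{5.9} $$

If we put $x = e^{\xi}$ and $t = x e^{-\eta}$, we have

$$ \int_{1}^{x} \left| R\!\left(\frac{x}{t}\right) \right| \log t \, dt = x \int_{0}^{\xi} |V(\eta)| (\xi - \eta) \, d\eta
 = x \int_{0}^{\xi} |V(\eta)| \, d\eta \int_{\eta}^{\xi} d\zeta = x \int_{0}^{\xi} d\zeta \int_{0}^{\zeta} |V(\eta)| \, d\eta $$

and (5.8) becomes

$$ \xi^2 |V(\xi)| \le 2 \int_{0}^{\xi} d\zeta \int_{0}^{\zeta} |V(\eta)| \, d\eta + O(\xi). \tag{5.10} $$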

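6. We have

$$ \int_{1}^{x} \frac{\psi(t)}{t^2} \, dt = \sum_{n \le x} \Lambda(n) \int_{n}^{x} \frac{dt}{t^2}
 = \sum_{n \le x} \Lambda(n) \left( \frac{1}{n} - \frac{1}{x} \right) = \log x + O(1) $$

by (4.3) and (4.4). Putting $t = e^{\eta}$, $x = e^{\xi}$, we find that

$$ \int_{0}^{\xi} V(\eta) \, d\eta = \int_{1}^{x} \frac{\psi(t) - t}{t^2} \, dt = O(1). $$

Hence there is a fixed number $A$ such that, for all $\xi_2 > \xi_1 > 0$,

$$ \left| \int_{\xi_1}^{\xi_2} V(\eta) \, d\eta \right| < A. \tag{6.1} $$

Next let $y > y_0$. Writing (4.6) with $x = y$ and again with $x = y_0$
and subtracting, we obtain

$$ 0 \le \psi(y) \log y - \psi(y_0) \log y_0 $$
$$ = 2y \log y - 2y_0 \log y_0 - \sum_{y_0 < mn \le y} \Lambda(m) \Lambda(n) + O(y) $$
$$ \le 2 (y \log y - y_0 \log y_0) + O(y), $$

since $\Lambda(n) \ge 0$ for all $n$. Hence

$$ | R(y) \log y - R(y_0) \log y_0 | \le y \log y - y_0 \log y_0 + O(y). \tag{6.2} $$

Now let us write $y = e^{\eta}$, $y_0 = e^{\eta_0}$, $\tau = \eta - \eta_0 > 0$ and suppose that
$V(\eta_0) = 0$. We have from (6.2)

$$ |V(\eta)| \le 1 - \frac{\eta_0}{\eta} e^{-\tau} + O\!\left(\frac{1}{\eta}\right)
 = 1 - e^{-\tau} + \frac{\tau}{\eta} e^{-\tau} + O\!\left(\frac{1}{\eta}\right), $$

since $\eta_0 = \eta - \tau$. Now

$$ \tau e^{-\tau} < 1, \qquad 1 - e^{-\tau} < \tau. $$

Hence

$$ |V(\eta)| < \eta - \eta_0 + O(\eta^{-1}), \tag{6.3} $$

provided $\eta > \eta_0$ and $V(\eta_0) = 0$.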
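The boundedness statement (6.1) rests on the fact that the integral of V over (0, log x) equals the sum of Λ(n)(1/n − 1/x) over n ≤ x, minus log x. A small sketch of this quantity for a few integer x follows; the helper names and the limit 50000 are assumptions made for illustration only.

```python
from math import log

def mangoldt_table(limit):
    """lam[n] = log p if n is a power of the prime p, else 0.0."""
    lam = [0.0] * (limit + 1)
    composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if composite[p]:
            continue
        for m in range(p * p, limit + 1, p):
            composite[m] = True
        q = p
        while q <= limit:
            lam[q] = log(p)
            q *= p
    return lam

def integral_of_V(x, lam):
    """Integral of V(eta) over 0 <= eta <= log x, for integer x >= 1.

    Uses int_1^x psi(t)/t^2 dt = sum_{n<=x} Lambda(n) (1/n - 1/x)
    and int_1^x dt/t = log x.
    """
    return sum(lam[n] * (1.0 / n - 1.0 / x) for n in range(2, x + 1)) - log(x)

lam = mangoldt_table(50000)
for x in (1000, 10000, 50000):
    print(x, round(integral_of_V(x, lam), 4))
```

The values should stay within a fixed band as x increases, in line with the existence of the constant A in (6.1).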


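7. Our object is to prove that $V(\xi) \to 0$ as $\xi \to \infty$. By (4.3) and (5.9),
$V(\xi)$ is bounded and so there is a number $\alpha$ such that*

$$ |V(\xi)| \le \alpha + o(1). \tag{7.1} $$

* We use $o(1)$ to denote a function of $\xi$ (or $\eta$ or $\zeta$ as the case may be) which
tends to zero as $\xi \to \infty$ and which is bounded for all $\xi > 0$. Also $o(\xi) = \xi\, o(1)$,
$o(\xi^2) = \xi^2 o(1)$. Hence

$$ \int_{0}^{\xi} o(1) \, d\eta = o(\xi), \qquad \int_{0}^{\xi} o(\eta) \, d\eta = o(\xi^2). $$

If $\alpha = 0$, we have nothing to prove. Hence let us suppose that $\alpha > 0$.
We take $\zeta > 0$ and write

$$ \beta = \frac{3\alpha^2 + 4A}{2\alpha}. $$

From the definition (5.9) of $V(\xi)$, it is clear that $V(\xi)$ decreases
continuously as $\xi$ increases except at points of discontinuity (viz., those
at which $\xi = \log p^m$) at which $V(\xi)$ increases. Hence $V(\xi)$ can only
change from positive to negative by passing through a zero. Hence,
for a given $\zeta > 0$, either there is an $\eta_0$ such that

$$ \zeta \le \eta_0 \le \zeta + \beta - \alpha, \qquad V(\eta_0) = 0 $$

or $V(\eta)$ cannot change sign more than once in the interval

$$ \zeta \le \eta \le \zeta + \beta - \alpha. \tag{7.2} $$

In the former case we have, by (6.3) and (7.1),

$$ \int_{\zeta}^{\zeta+\beta} |V(\eta)| \, d\eta = \int_{\zeta}^{\eta_0} + \int_{\eta_0}^{\eta_0+\alpha} + \int_{\eta_0+\alpha}^{\zeta+\beta} |V(\eta)| \, d\eta $$
$$ \le (\eta_0 - \zeta)\alpha + \int_{0}^{\alpha} \tau \, d\tau + (\zeta + \beta - \eta_0 - \alpha)\alpha + o(1) $$
$$ = \alpha\beta - \tfrac{1}{2}\alpha^2 + o(1) = \alpha'\beta + o(1), $$

where

$$ \alpha' = \alpha \left( 1 - \frac{\alpha}{2\beta} \right) = \alpha \left( 1 - \frac{\alpha^2}{3\alpha^2 + 4A} \right). \tag{7.3} $$

In the latter case, either $V(\eta)$ does not change sign in the interval
(7.2), so that

$$ \int_{\zeta}^{\zeta+\beta-\alpha} |V(\eta)| \, d\eta = \left| \int_{\zeta}^{\zeta+\beta-\alpha} V(\eta) \, d\eta \right| < A, $$

or $V(\eta)$ changes sign just once in the interval, say at $\eta = \eta_1$, when

$$ \int_{\zeta}^{\zeta+\beta-\alpha} |V(\eta)| \, d\eta = \left| \int_{\zeta}^{\eta_1} V(\eta) \, d\eta \right| + \left| \int_{\eta_1}^{\zeta+\beta-\alpha} V(\eta) \, d\eta \right| < 2A. $$

In either case, by (7.1),

$$ \int_{\zeta}^{\zeta+\beta} |V(\eta)| \, d\eta < 2A + \int_{\zeta+\beta-\alpha}^{\zeta+\beta} |V(\eta)| \, d\eta \le 2A + \alpha^2 + o(1) = \alpha''\beta + o(1), $$

where

$$ \alpha'' = \frac{2A + \alpha^2}{\beta} = \alpha \left( 1 - \frac{\alpha^2}{3\alpha^2 + 4A} \right) = \alpha'. $$

Hence, if $M = [\xi/\beta]$,

$$ \int_{0}^{\xi} |V(\eta)| \, d\eta = \sum_{m=0}^{M-1} \int_{m\beta}^{m\beta+\beta} |V(\eta)| \, d\eta + \int_{M\beta}^{\xi} |V(\eta)| \, d\eta
 \le \alpha'\beta M + o(M) + O(1) = \alpha'\xi + o(\xi). $$

Hence, by (5.10),

$$ \xi^2 |V(\xi)| \le 2 \int_{0}^{\xi} \{ \alpha'\zeta + o(\zeta) \} \, d\zeta + O(\xi) = \alpha'\xi^2 + o(\xi^2) $$

and so

$$ |V(\xi)| \le \alpha' + o(1). \tag{7.4} $$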


8. We can complete the proof in two alternative ways. We may
take

$$ \alpha = \limsup_{\xi \to \infty} |V(\xi)|. $$

If $\alpha > 0$, it follows from (7.4) that $\alpha \le \alpha'$ and from (7.3) that $\alpha' < \alpha$.
This is a contradiction and so $\alpha = 0$, that is $V(\xi) \to 0$ and

$$ \psi(x) \sim x. \tag{8.1} $$

If we wish, we can avoid an indirect proof and the idea of an upper
limit as follows. We know that (7.1) is true when $\alpha$ is equal to some $\alpha$
and we can take $A > 3\alpha^2$. We write $\alpha_n = \sqrt{5A/n}$, so that $\alpha_{15} > \alpha$ and
(7.1) is true for $\alpha = \alpha_{15}$. But, if $n \ge 15$ and $\alpha = \alpha_n \le \alpha_{15} = \sqrt{A/3}$,
we have

$$ \alpha' = \alpha_n \left( 1 - \frac{\alpha_n^2}{3\alpha_n^2 + 4A} \right) = \alpha_n \left( 1 - \frac{5}{4n + 15} \right) < \alpha_{n+1}. $$

Hence (7.1) is true for $\alpha = \alpha_{15}, \alpha_{16}, \alpha_{17}, \ldots$. Since $\alpha_n \to 0$ as $n \to \infty$,
it follows that $V(\xi) \to 0$ as $\xi \to \infty$.


Proof of the Prime Number Theorem

9. We write as usual

$$ \pi(x) = \sum_{p \le x} 1 = O(x). $$

We have

$$ \psi(x) = \sum_{n \le x} \Lambda(n) = \sum_{p^a \le x} \log p \le \log x \sum_{p^a \le x} 1
 = \log x \sum_{a \ge 1} \pi(x^{1/a}) = \pi(x) \log x + O(x^{1/2} \log^2 x) $$

and so

$$ \pi(x) \log x \ge x + o(x). $$

Next, if $X = x/\log x$,

$$ \psi(x) \ge \sum_{X < p \le x} \log p \ge \{ \pi(x) - \pi(X) \} \log X, $$

$$ \pi(x) \le \pi(X) + \frac{\psi(x)}{\log X} = O(X) + \frac{O(x)}{\log X} = O\!\left( \frac{x}{\log x} \right). $$

Hence

$$ \pi(x) \log x \le \pi(X) \log x + \frac{\psi(x) \log x}{\log X} \sim x, $$

and

$$ \pi(x) \sim \frac{x}{\log x} $$

as $x \to \infty$, which is the Prime Number Theorem.
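The two asymptotic statements, ψ(x) ~ x and π(x) ~ x/log x, can be illustrated numerically. The sketch below is not part of the proof; it assumes a plain sieve of Eratosthenes is adequate for the modest range shown, and the function names are illustrative.

```python
from math import log

def prime_sieve(limit):
    """Return the list of primes <= limit by the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(limit + 1) if sieve[n]]

def psi(x, primes):
    """psi(x) = sum of log p over all prime powers p^a <= x."""
    total = 0.0
    for p in primes:
        q = p
        while q <= x:
            total += log(p)
            q *= p
    return total

def ratios(x):
    """Return (pi(x) log x / x, psi(x) / x); both tend to 1 by the theorem."""
    primes = prime_sieve(x)
    return len(primes) * log(x) / x, psi(x, primes) / x

for x in (10 ** 3, 10 ** 4, 10 ** 5):
    print(x, ratios(x))
```

The ratio ψ(x)/x settles near 1 quickly, while π(x) log x / x approaches 1 only slowly, which is consistent with the extra logarithmic factors appearing in § 9.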


Remarks 

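10. It is enough for our purpose if the error term in (1.2) is replaced
by $O(x^{-\epsilon})$ for some fixed $\epsilon > 0$. In (4.1) we have then an error of
order

$$ \sum_{d \le x} \left( \frac{x}{d} \right)^{1-\epsilon} \left\{ \log\frac{x}{d} + \gamma \right\}
 \le \sum_{k=1}^{\infty} \; \sum_{x e^{-k} \le d < x e^{1-k}} \left( \frac{x}{d} \right)^{1-\epsilon} \left\{ \log\frac{x}{d} + \gamma \right\} $$
$$ \le \sum_{k=1}^{\infty} (k + \gamma)\, e^{k(1-\epsilon)}\, x e^{1-k} = e x \sum_{k=1}^{\infty} (k + \gamma)\, e^{-\epsilon k} = O(x). $$

If the error term in (1.2) is replaced by $o(1/\log x)$, however, we need to
modify our argument somewhat tediously. In general, $O(x)$ is replaced
by $o(x \log x)$ and, for example, $O(\xi)$ in (5.10) becomes $o(\xi^2)$. This suffices
to obtain the prime number theorem, of course.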

11. Selberg's inequality in its original form is

$$ \vartheta(x) \log x + \sum_{p \le x} \vartheta\!\left(\frac{x}{p}\right) \log p = 2x \log x + O(x), \tag{11.1} $$

where

$$ \vartheta(x) = \sum_{p \le x} \log p. $$

This can be deduced simply from (4.7).

12. The use of the elements of the integral calculus in §§ 5-7 makes
my presentation of the proof less "elementary" (in the technical logical
sense) than Selberg's (1949), which uses the theory of series only.
Selberg's proof could be applied to my (4.7) in just the same way as to
(11.1). But the use of the calculus makes unnecessary certain complexities
of detail and shows more clearly the fundamental ideas (which
are, of course, essentially Selberg's).


REFERENCES TO LITERATURE

Erdős, P., 1949. "On a New Method in Elementary Number Theory which
Leads to an Elementary Proof of the Prime Number Theorem", Proc. Nat.
Acad. Sci., xxxv, 374-384.

Hadamard, J., 1896. "Sur la distribution des zéros de la fonction ζ(s) et ses
conséquences arithmétiques", Bull. Soc. Math. France, xxiv, 199-220.

Selberg, A., 1949. "An Elementary Proof of the Prime-number Theorem",
Annals of Math., l, 305-313.

de la Vallée Poussin, Ch., 1896. "Recherches analytiques sur la théorie des
nombres premiers", Ann. Soc. Sc. Bruxelles, xx b, 183-256.

Van der Corput, J. G., 1949. "Démonstration élémentaire du théorème sur la
distribution des nombres premiers", Scriptum 1 (Mathematisch Centrum,
Amsterdam), 1-32.


(Issued separately May 30, 1952)





XIX.—A Generalization of the Classical Random-walk Problem,
and a Simple Model of Brownian Motion Based Thereon.
By G. Klein, Ph.D., Birkbeck College, London, now at
Université Libre de Bruxelles*. Communicated by Dr. R.
Fürth.

(MS. received September 18, 1951. Read December 3, 1951)

Synopsis 

Suggested by the analogy between the classical one-dimensional random-walk
and the approximate (diffusion) theory of Brownian motion, a generalisation of
the random-walk is proposed to serve as a model for the more accurate description
of the phenomenon. Using the methods of the calculus of finite differences, some
general results are obtained concerning averages based on a time-varying bivariate
discrete probability distribution in which the variates stand in the particular
relation of "position" and "velocity." These are applied to the special cases
of Brownian motion from initial thermal equilibrium, and from arbitrary initial
kinetic energy. In the latter case the model describes accurately quantised
Brownian motion of two energy states, one of zero energy.


Introduction 

The classical random-walk problem, proposed by Pearson, has been
thoroughly treated by Rayleigh. Independently, Smoluchowski used
equivalent mathematical methods to describe Brownian motion, and this
phenomenon has been linked with the random-walk problem ever since.
The latest contribution has come from Kac, who generalized the problem
so as to make it correspond to Brownian motion of an elastically bound
particle, and to include boundary problems. The connection and treatment
given so far hold however only for the "approximate" theory of
Brownian motion, that is, for times $t \gg 1/\gamma$, where $\gamma$ is the resistance
coefficient per unit mass of the medium. Under these conditions for a
"free" particle, Einstein's formula holds,

$$ \overline{x^2} = 2 \frac{kT}{m\gamma}\, t, \tag{1} $$

which is derived from statistical mechanics. Here $k$ is Boltzmann's
constant, $T$ the absolute temperature of the medium, $m$ the mass of the
Brownian particle, and $\overline{x^2}$ its average squared displacement during an
interval, $t$.

A complete and more accurate description of Brownian motion can
follow only from a detailed kinetic theory, but if mean squared displacement
formulae only are required, these can be deduced by various simpler
methods as only the principle of equipartition of energy is involved
(e.g., by an accurate integration of Langevin's pseudo-equation of motion
of a Brownian particle as given by Ornstein), and it may be taken as
established that the formulae replacing (1) are

$$ \overline{x^2} = 2 \frac{kT}{m\gamma}\, t - 2 \frac{kT}{m\gamma^2} (1 - e^{-\gamma t}), \tag{2} $$

for conservative diffusion, that is, for a particle initially already in
thermal equilibrium with the medium, and

$$ \overline{x^2} = 2 \frac{kT}{m\gamma}\, t + \frac{kT}{m\gamma^2} (1 - e^{-2\gamma t}) - 4 \frac{kT}{m\gamma^2} (1 - e^{-\gamma t}) + \frac{u_0^2}{\gamma^2} (1 - e^{-\gamma t})^2, \tag{3} $$

for one starting with arbitrary kinetic energy $\tfrac{1}{2} m u_0^2$ (Chandrasekhar,
1943).

* Most of the contents of this paper forms part of the author's Ph.D. Thesis,
University of London, 1951.

In order not to sever the connection with the random-walk problem
for the "accurate" theory, the simplicity of equal discrete random
displacements had to give way to the assumption of an appropriate
continuous probability distribution for the magnitude of these displacements,
and the problem, like the classical random-walk, becomes a
special case of the theory of Markoff processes (Chandrasekhar, 1943).

Now it will be shown that the "accurate" results, (2) and (3), can
also be obtained from a model more akin to the classical random-walk
if, following Fürth, one introduces a certain "persistence," suggested by
the physical situation. Mathematically this is equivalent to a discrete
velocity distribution of only three variates, instead of the continuous
one just mentioned. Although in some respects such a model must
appear artificial it is of intrinsic interest, and it seems to be the simplest
that is mathematically precise and at the same time shows the essential
features of the actual physical phenomenon.

General Theory 

Let $f(n, m; N)$ be a discrete bivariate distribution function in $n$ and
$m$ which varies in time, $N$ signifying the time instant considered. If $p$'s
are probabilities of transition during the interval, the change in the
distribution can be described completely by equations

$$ f(n, m; N+1) = \sum_{n'} \sum_{m'} p_{n', m';\, n, m;\, N}\, f(n', m'; N). \tag{4} $$

Now let the variates $n$ and $m$ be related in a special manner, such that

$$ n = n' + m'. \tag{5} $$

This can be interpreted thus: If $n$ denotes the position of a particle,
$f(n, m; N)$ is the probability that a particle will be found in the $n$th
position at the instant $N$ and that it will jump $m$ positions within the
following interval. Hence $m$ may be said to correspond to velocity.
Equations (4) become

$$ f(n, m; N+1) = \sum_{m'} p_{n-m';\, m', m;\, N}\, f(n - m', m'; N). \tag{6} $$

Using the Kronecker symbol and the "step-up" operators of the calculus
of finite differences,

$$ E_n = 1 + \Delta_n, \qquad E_N = 1 + \Delta_N, \tag{7} $$

these may be written

$$ \sum_{m'} \left( p_{n-m';\, m', m;\, N}\, E_n^{-m'} - \delta_{mm'} E_N \right) f(n, m'; N) = 0. \tag{8} $$

If the transition probabilities $p$ are independent of position and
time (for a Brownian particle this would mean it is "free" apart from
the random impacts from the medium molecules, or it is under a constant
external force), that is, if $p_{m', m}$ is the probability of a particle with
"velocity" $m'$ acquiring "velocity" $m$ after a single interval, the following
formal theory is possible. Equations (8), which become

$$ \sum_{m'} \left( p_{m', m}\, E_n^{-m'} - \delta_{mm'} E_N \right) f(n, m'; N) = 0, \tag{9} $$

are a system of linear partial difference equations with constant coefficients
in

$$ \ldots,\ f(n, m-1; N),\ f(n, m; N),\ f(n, m+1; N),\ \ldots \tag{10} $$

As in the theory of systems of linear differential equations with constant
coefficients, it follows that each of the $f(n, m; N)$ satisfies the same
difference equation, viz., the one constructed with the determinant of
the operational coefficients of the system of equations (9),

$$ \left| p_{i,j}\, E_n^{-i} - \delta_{i,j} E_N \right| f(n, m; N) = 0, \tag{11} $$

or, by a slight transformation,

$$ D(p; E_n, E_N)\, f(n, m; N) = 0, \tag{12} $$

where

$$ D(p; E_n, E_N) = \left| p_{i,j} - \delta_{i,j}\, E_n^{i} E_N \right|. \tag{13} $$

Any linear combination of the $f(n, m; N)$ will also satisfy the same
difference equation; in particular, for the number density,

$$ g(n; N) = \sum_{m} f(n, m; N), \tag{14} $$

one has

$$ D(p; E_n, E_N)\, g(n; N) = 0. \tag{15} $$

Since in Brownian motion one is mainly interested in the moments
$\overline{n}_N, \overline{n^2}_N$, etc., it is advantageous to define the moment generating function

$$ M(\alpha; N) = \sum_{n} e^{\alpha n} g(n; N) \tag{16} $$
$$ = 1 + \alpha\, \overline{n}_N + \frac{1}{2!} \alpha^2\, \overline{n^2}_N + \cdots \tag{17} $$

Multiplying (15) by $\exp(\alpha n)$ on the left, expanding the determinant in
powers of $E_n$, passing $\exp(\alpha n)$ through the operator so as to make it a
factor of $g(n; N)$ with appropriate adjustment, summing for all $n$, and
remembering that $E_n$ operating on a function not containing $n$ may be
replaced by unity, one finally obtains an ordinary difference equation for
$M(\alpha; N)$,

$$ D(p; e^{-\alpha}, E_N)\, M(\alpha; N) = 0. \tag{18} $$

In principle, the number density itself can then be found from

$$ g(n; N) = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-i\theta n} M(i\theta; N) \, d\theta. \tag{19} $$

The solution of (18) and the subsequent expansion (17), or the integration
(19) are in general very difficult; if, however, only the first two moments
are required, it is sufficient to expand the determinant and the moment
generating function up to the second power of $\alpha$ only, and to equate
coefficients of $\alpha$ and $\alpha^2$ in (18) to zero. This gives

$$ D(p; 1, E_N)\, \overline{n}_N + \left[ \frac{\partial}{\partial\alpha} D(p; e^{-\alpha}, 1) \right]_{\alpha=0} = 0, \tag{20} $$

$$ D(p; 1, E_N)\, \overline{n^2}_N + 2 \left[ \frac{\partial}{\partial\alpha} D(p; e^{-\alpha}, E_N) \right]_{\alpha=0} \overline{n}_N
 + \left[ \frac{\partial^2}{\partial\alpha^2} D(p; e^{-\alpha}, 1) \right]_{\alpha=0} = 0. \tag{21} $$

In the special cases where the $p$'s are symmetrical (no external
force), that is, where

$$ p_{-m', -m} = p_{m', m}, \tag{22} $$

in addition to the necessary relation

$$ \sum_{m} p_{m', m} = 1, \tag{23} $$

one finds that the second terms in (20) and (21) vanish, so that these
equations become

$$ D(p; 1, E_N)\, \overline{n}_N = 0, \tag{24} $$

$$ D(p; 1, E_N)\, \overline{n^2}_N + \left[ \frac{\partial^2}{\partial\alpha^2} D(p; e^{-\alpha}, 1) \right]_{\alpha=0} = 0. \tag{25} $$

In the following sections only the simplest applications of cases (22) will
be considered.



As noted above, any linear combination of the $f(n, m; N)$ will satisfy
the difference equation (12). Thus the mean total energy and the
marginal velocity distribution function $h(m; N)$ at the instant $N$,

$$ \mathcal{E}_N = \sum_{n} \sum_{m} \tfrac{1}{2}\, \mathrm{m}\, m^2 f(n, m; N), \tag{26} $$

$$ h(m; N) = \sum_{n} f(n, m; N), \tag{27} $$

are easily seen to satisfy

$$ D(p; 1, E_N)\, \mathcal{E}_N = 0, \tag{28} $$

$$ D(p; 1, E_N)\, h(m; N) = 0. \tag{29} $$


Conservative Case

The simplest case to which the preceding theory may be applied is
the one where the only transition probabilities are

$$ \begin{matrix} p_{-1,-1} & p_{-1,1} \\ p_{1,-1} & p_{1,1} \end{matrix} \tag{30} $$

which in virtue of (22) and (23) reduce to

$$ \begin{matrix} p & 1-p \\ 1-p & p \end{matrix} \tag{31} $$

where $p$ is the probability of persistence of velocity, or motion. There
is a persistence proper only if $\tfrac{1}{2} < p < 1$, while for $p = \tfrac{1}{2}$ the case reduces
to the classical random-walk.

From the definition (13) one finds

$$ D(p; e^{-\alpha}, E_N) = E_N^2 - (e^{\alpha} + e^{-\alpha})\, p\, E_N + 2p - 1. \tag{32} $$

Substitution into (24), (25) gives the difference equations

$$ (E_N + 1 - 2p)(E_N - 1)\, \overline{n}_N = 0, \tag{33} $$

$$ (E_N + 1 - 2p)(E_N - 1)\, \overline{n^2}_N = 2p, \tag{34} $$

of which the solutions are

$$ \overline{n}_N = \text{const.} + \text{const.}\, (2p - 1)^N, \tag{35} $$

$$ \overline{n^2}_N = \frac{p}{1-p}\, N + \text{const.}\, (2p - 1)^N + \text{const.} \tag{36} $$

With the special initial condition of a particle starting at instant $N = 0$
from position $n = 0$ with equal probabilities $\tfrac{1}{2}$ to either direction, one
easily calculates

$$ \overline{n}_N = 0, \tag{37} $$

$$ \overline{n^2}_N = \frac{p}{1-p}\, N - \frac{2p - 1}{2(1-p)^2} \left\{ 1 - (2p - 1)^N \right\}. \tag{38} $$

The latter formula was first obtained by Fürth (1920), to describe
the apparently random motion of certain infusoria. Owing to their, if
limited, memory these animals will in fact show a certain persistence in
motion, and the mean square displacement law is not of the same form
as Einstein's formula, (1), of the "approximate," or diffusion theory of
Brownian motion. Only with $p = \tfrac{1}{2}$ (no persistence) follows the
formula corresponding to (1), $\overline{n^2}_N = N$ of the classical random walk.
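Formula (38) can be checked against a direct evolution of the distribution f(n, m; N) under the transition scheme (31). The sketch below is an illustration only; the function names are assumptions, and the closed form is written as p N/(1−p) − (2p−1){1 − (2p−1)^N}/(2(1−p)^2), which is what the persistence model gives for the symmetric start described in the text.

```python
def persistent_walk_msd(p, steps):
    """Exact mean squared displacement after 0..steps intervals.

    f[(n, m)] is the probability of being at position n and about to
    jump m (= +1 or -1) in the following interval; p is the persistence.
    """
    f = {(0, 1): 0.5, (0, -1): 0.5}
    msd = [0.0]
    for _ in range(steps):
        new = {}
        for (n, m), prob in f.items():
            pos = n + m                                  # carry out the pending jump
            for m_next, w in ((m, p), (-m, 1.0 - p)):    # persist or reverse
                key = (pos, m_next)
                new[key] = new.get(key, 0.0) + prob * w
        f = new
        msd.append(sum(prob * n * n for (n, _), prob in f.items()))
    return msd

def formula_38(p, N):
    """Closed form for the mean squared displacement after N intervals."""
    r = 2.0 * p - 1.0
    return p / (1.0 - p) * N - r / (2.0 * (1.0 - p) ** 2) * (1.0 - r ** N)

for p in (0.5, 0.7, 0.9):
    exact = persistent_walk_msd(p, 20)
    print(p, exact[20], formula_38(p, 20))
```

For p = 1/2 both columns reduce to N, the classical random-walk value; for larger p the displacement grows faster, as the persistence argument suggests.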

Now, as Fürth pointed out, the model really does apply to a Brownian
particle as well. For, owing to its inertia, a particle moving in a certain
direction in a small interval is more likely to continue in this direction
in the following interval than to reverse, especially as the interval is
made infinitely small. Thus, with the interval tending to zero, and the
persistence $p$ tending to unity, (38) gives in the limit the "accurate"
mean squared displacement formula (2) for Brownian particles initially
already in thermal equilibrium with the medium. (This will show also
the divergence from the "approximate" theory in which the interval
has to remain finite and sufficiently large for there to be no correlation
or persistence of motion in consecutive intervals, hence there $p = \tfrac{1}{2}$ and
the classical random-walk analogy.) Writing

$$ x = n\xi, \qquad t = N\tau, \qquad p = 1 - \tfrac{1}{2}\gamma\tau, \tag{39} $$

where $\xi$ and $\tau$ are the standard distance and time intervals, and $\gamma$ is a
constant (an inverse measure of the persistence), on proceeding to the
limit $\xi, \tau \to 0$, one obtains from (38)

$$ \overline{x^2} = \frac{2}{\gamma} (\xi/\tau)^2\, t - \frac{2}{\gamma^2} (\xi/\tau)^2 (1 - e^{-\gamma t}). \tag{40} $$

Now $\tfrac{1}{2} m (\xi/\tau)^2$ is the standard kinetic energy of a particle in this model;
equating it to the equipartition value $\tfrac{1}{2} kT$, and substituting into (40), one
gets (2).
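The passage to the limit from (38) to (40) can be illustrated numerically by taking a small but finite interval. In the sketch below the speed c stands for the ratio of the standard distance to the standard time interval, and the values of γ, c, t and the interval length are arbitrary illustrative assumptions.

```python
from math import exp

def msd_discrete(gamma, c, t, tau):
    """Discrete formula (38) scaled to physical units.

    Uses xi = c * tau, N = t / tau and p = 1 - gamma * tau / 2, as in (39).
    """
    N = round(t / tau)
    p = 1.0 - 0.5 * gamma * tau
    r = 2.0 * p - 1.0
    n2 = p / (1.0 - p) * N - r / (2.0 * (1.0 - p) ** 2) * (1.0 - r ** N)
    return (c * tau) ** 2 * n2

def msd_limit(gamma, c, t):
    """Limiting formula (40): 2 c^2 t / gamma - 2 c^2 (1 - exp(-gamma t)) / gamma^2."""
    return 2.0 * c * c * t / gamma - 2.0 * c * c * (1.0 - exp(-gamma * t)) / gamma ** 2

gamma, c, t = 2.0, 1.5, 3.0
for tau in (1e-2, 1e-3, 1e-4):
    print(tau, msd_discrete(gamma, c, t, tau), msd_limit(gamma, c, t))
```

The discrete value approaches the limiting one roughly in proportion to the interval length, which is the behaviour expected from a first-order expansion in γτ.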

In the present case, (31), the kinetic energy of the motion never
changes, and in order to derive by this "persistence" method the formulae
for Brownian motion from rest, or arbitrary initial velocity, a less
simple model is required. This suggested the generalization in the
previous and following sections.

Before leaving this case, it may be noted that here the moment-generating
function itself can be evaluated from (18) without difficulty.
It is

$$ M(\alpha; N) = A_1(\alpha) \left[ p \cosh\alpha + \sqrt{(1-p)^2 + p^2 \sinh^2\alpha} \right]^N
 + A_2(\alpha) \left[ p \cosh\alpha - \sqrt{(1-p)^2 + p^2 \sinh^2\alpha} \right]^N, \tag{41} $$

where $A_1(\alpha)$ and $A_2(\alpha)$ are functions of $\alpha$ depending only on initial
conditions. For the special initial conditions taken above (after (36)),
which are equivalent to

$$ M(\alpha; 0) = 1, \qquad M(\alpha; 1) = \cosh\alpha, \tag{42} $$

one finds

$$ A_1(\alpha) = \tfrac{1}{2} \left[ 1 + \cosh\alpha \left\{ 1 + \left( \frac{p}{1-p} \right)^2 \sinh^2\alpha \right\}^{-1/2} \right], \tag{43} $$

$$ A_2(\alpha) = \tfrac{1}{2} \left[ 1 - \cosh\alpha \left\{ 1 + \left( \frac{p}{1-p} \right)^2 \sinh^2\alpha \right\}^{-1/2} \right], \tag{44} $$

from which (37) and (38) might have been derived. For the case of the
classical random-walk, $p = \tfrac{1}{2}$, and the same initial condition, (41) reduces
to

$$ M(\alpha; N) = (\cosh\alpha)^N. \tag{45} $$

Dissipative Case 

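The next simple case is the special one in which there is only one
standard speed, and a possibility of rest but not of outright reversal.
Here the only surviving transition probabilities are

$$ \begin{matrix} p_{-1,-1} & p_{-1,0} & \cdot \\ p_{0,-1} & p_{0,0} & p_{0,1} \\ \cdot & p_{1,0} & p_{1,1} \end{matrix} \tag{46} $$

which because of (22) and (23) can be taken as

$$ \begin{matrix} p & 1-p & \cdot \\ \tfrac{1}{2}(1-q) & q & \tfrac{1}{2}(1-q) \\ \cdot & 1-p & p \end{matrix} \tag{47} $$

where $p$ is the probability of persistence in motion, and $q$ is the probability
of persistence at rest in the standard time interval. The operational
determinant is now

$$ D(p; e^{-\alpha}, E_N) = -E_N^3 + (2p \cosh\alpha + q) E_N^2
 + \{ (1 - p - q - pq) \cosh\alpha - p^2 \} E_N + p(p + q - 1). \tag{48} $$

Hence substitution into (24), (25) gives

$$ (E_N - p)(E_N + 1 - p - q)(E_N - 1)\, \overline{n}_N = 0, \tag{49} $$

$$ (E_N - p)(E_N + 1 - p - q)(E_N - 1)\, \overline{n^2}_N = (1 + p)(1 - q), \tag{50} $$

with the solutions

$$ \overline{n}_N = \text{const.} + \text{const.}\, (p + q - 1)^N + \text{const.}\, p^N, \tag{51} $$

$$ \overline{n^2}_N = \frac{(1+p)(1-q)}{(1-p)(2-p-q)}\, N + \text{const.} + \text{const.}\, (p + q - 1)^N + \text{const.}\, p^N. \tag{52} $$

Taking the special initial conditions

$$ g(0; 0) = 1 = f(0, -1; 0) + f(0, 0; 0) + f(0, 1; 0), \tag{53} $$

$$ \overline{m}_0 = -f(0, -1; 0) + f(0, 1; 0), \tag{54} $$

$$ \overline{m^2}_0 = f(0, -1; 0) + f(0, 1; 0), \tag{55} $$

the solutions are found to be

$$ \overline{n}_N = \frac{\overline{m}_0}{1-p} (1 - p^N), \tag{56} $$

$$ \overline{n^2}_N = \frac{(1+p)(1-q)}{(1-p)(2-p-q)}\, N
 + \frac{2p + q - 1}{(2-p-q)(1-q)} \left\{ \frac{1-q}{2-p-q} - \overline{m^2}_0 \right\} \left\{ 1 - (p + q - 1)^N \right\} $$
$$ \qquad - \frac{2p}{(1-p)(1-q)} \left\{ \frac{1-q}{1-p} - \overline{m^2}_0 \right\} \left\{ 1 - p^N \right\}. \tag{57} $$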

From (29), with the same initial conditions, one finds in particular

$$ h(0; N) = h(0; 0) + \left\{ \frac{1-p}{2-p-q} - h(0; 0) \right\} \left\{ 1 - (p + q - 1)^N \right\}, \tag{58} $$

so that

$$ h(0; \infty) = (1 - p)/(2 - p - q); \tag{59} $$

thus eventually the number of particles in motion and the number of
particles at rest will tend, irrespective of initial conditions, to the constant
ratio

$$ (1 - q)/(1 - p). \tag{60} $$

It can similarly be shown that

$$ \overline{m}_N = \overline{m}_0\, p^N, \tag{61} $$

consistent with (56); and from (28) that

$$ \mathcal{E}_N = \mathcal{E}_0 + (\mathcal{E}_\infty - \mathcal{E}_0) \left\{ 1 - (p + q - 1)^N \right\}. \tag{62} $$
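The closed forms for the three-state model can be compared with a direct evolution of f(n, m; N) under the scheme (47). The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the expressions coded in `closed_forms` are the mean displacement (56), the mean squared displacement (57) written with the coefficients (1+p)(1−q)/((1−p)(2−p−q)), (2p+q−1)/((2−p−q)(1−q)) and 2p/((1−p)(1−q)), and the rest probability (58).

```python
def dissipative_walk(p, q, f0, steps):
    """Evolve the three-velocity model; f0 = (P(m=-1), P(m=0), P(m=+1)) at n = 0.

    Returns lists (mean n, mean n^2, probability of rest) for N = 0..steps.
    """
    trans = {
        -1: ((-1, p), (0, 1.0 - p)),
        0: ((-1, 0.5 * (1.0 - q)), (0, q), (1, 0.5 * (1.0 - q))),
        1: ((0, 1.0 - p), (1, p)),
    }
    f = {(0, m): w for m, w in zip((-1, 0, 1), f0) if w > 0.0}
    mean, msd, rest = [], [], []
    for _ in range(steps + 1):
        mean.append(sum(w * n for (n, _), w in f.items()))
        msd.append(sum(w * n * n for (n, _), w in f.items()))
        rest.append(sum(w for (_, m), w in f.items() if m == 0))
        new = {}
        for (n, m), w in f.items():
            for m_next, t in trans[m]:       # jump by m, then choose next velocity
                key = (n + m, m_next)
                new[key] = new.get(key, 0.0) + w * t
        f = new
    return mean, msd, rest

def closed_forms(p, q, f0, N):
    """Closed-form mean, mean square and rest probability after N intervals."""
    m1 = f0[2] - f0[0]                       # mean initial velocity, (54)
    m2 = f0[2] + f0[0]                       # mean squared initial velocity, (55)
    s = p + q - 1.0
    w_inf = (1.0 - q) / (2.0 - p - q)        # limiting fraction in motion
    mean = m1 * (1.0 - p ** N) / (1.0 - p)
    msd = ((1 + p) * (1 - q) / ((1 - p) * (2 - p - q)) * N
           + (2 * p + q - 1) / ((2 - p - q) * (1 - q)) * (w_inf - m2) * (1 - s ** N)
           - 2 * p / ((1 - p) * (1 - q)) * ((1 - q) / (1 - p) - m2) * (1 - p ** N))
    rest = (1 - m2) + ((1 - p) / (2 - p - q) - (1 - m2)) * (1 - s ** N)
    return mean, msd, rest

p, q, f0 = 0.8, 0.6, (0.1, 0.3, 0.6)
mean, msd, rest = dissipative_walk(p, q, f0, 25)
print(mean[25], msd[25], rest[25])
print(closed_forms(p, q, f0, 25))
```

Agreement of the two printed lines for several choices of p, q and the initial weights gives some reassurance that the reconstructed coefficients are mutually consistent.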



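Here

$$ \mathcal{E}_\infty = \tfrac{1}{2}\, \mathrm{m}\, \frac{1-q}{2-p-q}, \tag{63} $$

from which it can be deduced that

$$ \mathcal{E}_{N+1} - \mathcal{E}_N = (2 - p - q)(\mathcal{E}_\infty - \mathcal{E}_N). \tag{64} $$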

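Taking

$$ x = n\xi, \quad t = N\tau; \qquad p = 1 - \gamma_{10}\tau, \quad q = 1 - \gamma_{01}\tau, \tag{65} $$

where $\gamma_{10}$ and $\gamma_{01}$ are clearly transition probabilities per unit time,
relations (56) and (61) become in the limit

$$ \overline{x} = \overline{u}_0 (1 - e^{-\gamma_{10} t})/\gamma_{10}, \tag{66} $$

$$ \overline{u} = \overline{u}_0\, e^{-\gamma_{10} t}, \tag{67} $$

for the dependence on time of the position and velocity of the centre of
the distribution. The last equation gives

$$ \frac{d\overline{u}}{dt} = -\gamma_{10}\, \overline{u}, \tag{68} $$

so that the motion of the centre of the distribution is analogous to that
of a body moving in a resisting medium, and one can consider $\gamma_{10}$ as a
resistance constant per unit mass.

The ratio of particles "in motion" to those "at rest" is by (60)
and (65)

$$ \gamma_{01}/\gamma_{10}, \tag{69} $$

and the relation corresponding to (63) is

$$ E_\infty = \frac{\gamma_{01}}{\gamma_{01} + \gamma_{10}}\, \tfrac{1}{2} m (\xi/\tau)^2. \tag{70} $$

Using this expression, (57) becomes in the limit

$$ \overline{x^2} = 2\, \frac{2E_\infty}{m\gamma_{10}}\, t
 + \frac{4E_\infty}{m(\gamma_{01} + \gamma_{10})\gamma_{01}} \left\{ 1 - e^{-(\gamma_{01} + \gamma_{10}) t} \right\}
 - \frac{4(\gamma_{01} + \gamma_{10}) E_\infty}{m\gamma_{10}^2 \gamma_{01}} \left( 1 - e^{-\gamma_{10} t} \right) $$
$$ \qquad + \overline{u_0^2} \left\{ \frac{2 (1 - e^{-\gamma_{10} t})}{\gamma_{10}\gamma_{01}}
 - \frac{2 \{ 1 - e^{-(\gamma_{01} + \gamma_{10}) t} \}}{(\gamma_{01} + \gamma_{10})\gamma_{01}} \right\}. \tag{71} $$

Thus unlike formula (3) which contains three parameters ($kT/m$, $\gamma$, $u_0^2$),
the formula arrived at contains four parameters. It becomes formally
identical with (3) if one puts $\gamma_{10} = \gamma_{01} = \gamma$ (number of particles in
motion and those at rest tending to equality) and $E_\infty = \tfrac{1}{2} kT$, the
equipartition value. In view of the following interpretation, due to Dr.
Fürth, this is, however, not permissible.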



Formula (71) is a displacement formula for a quantized motion of two
energy states, one of zero energy, and the other degenerate with velocities
$\pm \lim (\xi/\tau)$, with $\gamma_{10}$ and $\gamma_{01}$ as transition probabilities. The probabilities
of these two states in stationary equilibrium at temperature $T$ are
determined by the well-known relation from statistical mechanics

$$ W(\epsilon_i) = g_i\, e^{-\epsilon_i / kT}, \tag{72} $$

where $g_i$ is the statistical weight of the energy state $\epsilon_i$. In the present
case where $\epsilon$ is restricted to $\epsilon_0 = 0$ and $\epsilon_1$, the corresponding statistical
weights are equal since transitions can occur only between $\epsilon_0$ and $\epsilon_1$,
but not between the two (degenerate) states belonging to $\epsilon_1$. Hence

$$ \frac{\gamma_{10}}{\gamma_{01}} = \frac{W(\epsilon_0)}{W(\epsilon_1)} = e^{\epsilon_1 / kT}. \tag{73} $$

For the limiting case of very high temperature it clearly follows that

$$ \gamma_{10} = \gamma_{01} = \gamma \ \text{(say)}, \tag{74} $$

or $p = q$. Thus the number of parameters is reduced to three, and it is
seen that (71) now becomes formally identical with (3), viz.,

$$ \overline{x^2} = 2\, \frac{2E_\infty}{m\gamma}\, t + 2\, \frac{E_\infty}{m\gamma^2} (1 - e^{-2\gamma t})
 - 4\, \frac{2E_\infty}{m\gamma^2} (1 - e^{-\gamma t}) + \frac{\overline{u_0^2}}{\gamma^2} (1 - e^{-\gamma t})^2. \tag{75} $$

This is to be expected, as the "classical" results are valid only in the
limiting case of very high temperature. However, $E_\infty$ will not be
identical with the equipartition value $\tfrac{1}{2} kT$, since the equipartition theorem
holds only for continuous energy distributions whereas the present
model is strictly restricted to the energies $0$ and $\epsilon_1$.
From (64), proceeding to the limit, it follows that

$$ \frac{dE}{dt} = -(\gamma_{01} + \gamma_{10})(E - E_\infty), \tag{76} $$

$$ \phantom{\frac{dE}{dt}} = -2\gamma (E - E_\infty), \tag{77} $$

so that the model reflects faithfully an interesting property of Brownian
motion: the energy follows the usual decay law, or, considering particles
as having an absolute kinetic temperature proportional to their random
energy, (77) may be interpreted thus: the law according to which the
medium supplies energy to the particles is analogous to Newton's law
of cooling (cf. Prigogine, 1947).


Acknowledgment 

The author wishes to thank Dr. R. Fürth, of Birkbeck College, for
introducing him to the study of Brownian motion, and for many constructive
suggestions which contributed to this paper.



Appendix

The following results concerning the classical one-dimensional random-walk
may be of interest. The moment-generating function of the
distribution in $n$ has been found as, (45),

$$ M^{(n)}(\alpha; N) = (\cosh\alpha)^N. \tag{78} $$

Hence, from (19) and (17) follow the well-known distribution itself and
the moments

$$ g(n; N) = \frac{N!}{2^N \{ \tfrac{1}{2}(N + n) \}!\, \{ \tfrac{1}{2}(N - n) \}!}, \tag{79} $$

$$ \overline{n^{2r}}_N = \sum_{j + 2k + 3l + \cdots = r} \frac{(2r)!\, N!}{(N - j - k - l - \cdots)!\; j!\, k!\, l! \cdots (2!)^j (4!)^k (6!)^l \cdots}, \tag{80} $$

$$ \overline{n^2}_N = N, \qquad \overline{n^4}_N = N(3N - 2), \tag{81} $$

$$ \overline{n^6}_N = N(15N^2 - 30N + 16), \tag{82} $$

$$ \overline{n^8}_N = N(105N^3 - 420N^2 + 588N - 272). \tag{83} $$
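The even moments (81) to (83) can be verified exactly from the binomial distribution of the classical walk. The sketch below is illustrative; it assumes the walk starts at the origin and takes N steps of ±1 with equal probability, so that n = 2j − N where j is the number of forward steps, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def walk_moment(N, k):
    """Exact k-th moment of the displacement n after N steps of +1 or -1."""
    total = sum(comb(N, j) * (2 * j - N) ** k for j in range(N + 1))
    return Fraction(total, 2 ** N)

def moment_formulas(N):
    """Polynomial expressions for the 2nd, 4th, 6th and 8th moments."""
    return {
        2: N,
        4: N * (3 * N - 2),
        6: N * (15 * N ** 2 - 30 * N + 16),
        8: N * (105 * N ** 3 - 420 * N ** 2 + 588 * N - 272),
    }

for N in (1, 2, 5, 10):
    print(N, all(walk_moment(N, k) == v for k, v in moment_formulas(N).items()))
```

Exact rational arithmetic is used so that agreement is an identity rather than a floating-point coincidence.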


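In view of Einstein's formula (1) it is of some interest to consider 
the distribution, not of $n$ but of $n^2$. From (78) it can be shown that 
the moment-generating function of this latter distribution is 

$$M^{(n^2)}(\alpha; N) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2}\, M^{(n)}(x\sqrt{2\alpha}; N)\,dx, \tag{84}$$

$$\phantom{M^{(n^2)}(\alpha; N)} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-x^2/2}\,(\cosh x\sqrt{2\alpha})^N\,dx, \tag{85}$$

and hence, or better, directly from (81)-(83), 

$$\overline{n^2} = N, \qquad \overline{(n^2 - \overline{n^2})^2} = 2N(N-1), \tag{86}$$

$$\overline{(n^2 - \overline{n^2})^3} = 8N(N-1)(N-2), \tag{87}$$

$$\overline{(n^2 - \overline{n^2})^4} = 4N(N-1)(15N^2 - 63N + 68). \tag{88}$$

Hence the discrete probability distribution in $n^2$ has 

$$\overline{n^2} = \text{mean} = N, \tag{89}$$

$$\overline{(n^2 - \overline{n^2})^2} = \text{variance} = 2N(N-1) \sim 2N^2, \tag{90}$$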





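$$\alpha_3 = \frac{\overline{(n^2 - \overline{n^2})^3}}{\{\overline{(n^2 - \overline{n^2})^2}\}^{3/2}} = \text{skewness} = \frac{2\sqrt{2}\,(N-2)}{\sqrt{N(N-1)}} \to 2\sqrt{2}, \tag{91}$$

$$\alpha_4 = \frac{\overline{(n^2 - \overline{n^2})^4}}{\{\overline{(n^2 - \overline{n^2})^2}\}^{2}} = \text{kurtosis} = \frac{15N^2 - 63N + 68}{N(N-1)} \to 15. \tag{92}$$

Thus the distribution of $n^2$ is leptokurtic and of the $\chi^2$ type, or Pearson 
Type III. 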
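
The moment formulae of this Appendix can be checked by direct enumeration. The following is a minimal sketch, assuming the classical walk of N independent steps of ±1 with equal probability; the function names are illustrative and do not come from the paper. It computes the exact distribution of n with rational arithmetic and compares the raw moments of n and the central moments of n² with the closed forms quoted above.

```python
from fractions import Fraction
from math import comb

def walk_distribution(N):
    """Exact distribution g(n; N) of the end position n after N steps of +1 or -1."""
    return {2 * k - N: Fraction(comb(N, k), 2 ** N) for k in range(N + 1)}

def raw_moment(N, r):
    """Exact mean value of n**r."""
    return sum(prob * n ** r for n, prob in walk_distribution(N).items())

def central_moment_of_square(N, r):
    """Exact r-th central moment of n**2, whose mean is N."""
    return sum(prob * (n * n - N) ** r for n, prob in walk_distribution(N).items())

for N in range(1, 13):
    # Raw moments of n, equations (81)-(83).
    assert raw_moment(N, 2) == N
    assert raw_moment(N, 4) == N * (3 * N - 2)
    assert raw_moment(N, 6) == N * (15 * N**2 - 30 * N + 16)
    assert raw_moment(N, 8) == N * (105 * N**3 - 420 * N**2 + 588 * N - 272)
    # Central moments of n**2, equations (86)-(88).
    assert central_moment_of_square(N, 2) == 2 * N * (N - 1)
    assert central_moment_of_square(N, 3) == 8 * N * (N - 1) * (N - 2)
    assert central_moment_of_square(N, 4) == 4 * N * (N - 1) * (15 * N**2 - 63 * N + 68)
```

Because the arithmetic is exact, agreement for a range of N is a strong check on the polynomial coefficients, although it is of course not a proof for all N.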


REFERENCES TO LITERATURE 

Chandrasekhar, S., 1943. "Stochastic Problems in Physics and Astronomy", 
Rev. Mod. Phys., xv, 1-89. 

Einstein, A., 1905. "Über die von der molekularkinetischen Theorie der Wärme 
geforderte Bewegung von in ruhenden Flüssigkeiten suspendierten Teilchen", 
Ann. der Phys., xvii, 549-560. 

Fürth, R., 1920. "Die Brownsche Bewegung bei Berücksichtigung einer 
Persistenz der Bewegungsrichtung. Mit Anwendungen auf die Bewegung lebender 
Infusorien", Zeits. f. Phys., ii, 244-256. 

Kac, M., 1945. "Random Walk in the Presence of Absorbing Barriers", Ann. 
Math. Statistics, xvi, 62-67. 

-, 1947. "Random Walk and the Theory of Brownian Motion", Amer. Math. 
Mon., liv, 369-391. 

Ornstein, L. S., 1918. "On the Brownian Motion", Proc. Kon. Ned. Akad. 
Wet. Amsterdam, xxi, 96-108. 

Prigogine, I., 1947. Étude Thermodynamique des Phénomènes irréversibles, 
Dunod, Paris. 

Rayleigh, Lord, 1919. "On the Problem of Random Vibrations, and of Random 
Flight in One, Two, or Three Dimensions", Phil. Mag., xxxvii, 321-347. 

-, Scientific Papers. 

Smoluchowski, M. v., 1906. "Zur kinetischen Theorie der Brownschen 
Molekularbewegung und der Suspensionen", Ann. der Phys., xxi, 756-780. 

-, 1923. Ostwalds Klassiker, No. 207, Leipzig. 


(Issued separately, May 30, 1952) 





XX.— On the Estimation of Variance and Covariance. By E. H. 

Lloyd, Imperial College, London. Communicated by Professor 
H. Levy. 

(MS. received September 3, 1951. Revised MS. received November 28, 1951. 

Read January 7, 1952) 

Synopsis 

Suppose we have a number of independent pairs of observations $(X_i, Y_i)$ on 
two correlated variates (X, Y), which have constant variances and covariance, 
and whose expected values are of known linear form, with unknown coefficients: 
say $\sum_j p_{ij} a_j$, $\sum_j q_{ij} b_j$ respectively. The $p_{ij}$ and the $q_{ij}$ are known, 
the $a_j$ and the $b_j$ are unknown. The paper discusses the estimation of the 
coefficients, and of the variances and the covariance, and evaluates the 
sampling variances of the estimates. The argument is entirely free of 
distributional assumptions. 


1. Introduction 

The subject of this paper was suggested by a problem first discussed 
by Aitken (1948). That problem is here generalized and freed from 
distributional assumptions. 

Suppose we have n independent pairs of observations $(X_i, Y_i)$, with 
variances $\sigma_1^2$, $\sigma_2^2$ respectively, covariance $\rho\sigma_1\sigma_2$, and expectations 
which are linear functions of k ($\le n$) constant coefficients. The problem 
is to estimate these coefficients, and the variances and the covariance of 
X and Y, and to evaluate the sampling variances of these estimates. 

The expectations have the form 

$$\mathcal{E}(X_i) = \sum_{j=1}^{k} p_{ij} a_j, \qquad \mathcal{E}(Y_i) = \sum_{j=1}^{k} q_{ij} b_j \qquad (i = 1, 2, \dots, n). \tag{1.1}$$

Here the $a_j$ and the $b_j$ are the coefficients to be estimated, and the $p_{ij}$ 
and the $q_{ij}$ are any set of numbers determined by the design of the 
experiment. We assume the matrices $[p_{ij}]$, $[q_{ij}]$ to be of full rank k. 

It will be shown that subject to certain (stated) conditions on the 
$p_{ij}$ and the $q_{ij}$, the best unbiased linear estimates of the $a_j$ and the $b_j$ 
are unaffected by ignorance of $\rho$. 

When one estimates the $a_j$ and the $b_j$, unbiased quadratic estimates 
of $\sigma_1^2$, $\sigma_2^2$ and $\rho\sigma_1\sigma_2$ may also be obtained as by-products. Formulae 
are derived giving the sampling variances of these estimates in terms of 
the second and fourth central moments of the original observations. 

The discussion is conducted entirely in distribution-free terms; in 
particular, the results do not depend on normality. 





2. Notation 

Variates and observations on variates are denoted by upper case 
Latin letters. Symbols in bold type represent matrices, including 
vectors, a vector being understood as a column-vector. A prime, as 
usual, denotes transposition. The standard symbol I represents the unit 
matrix (this being a slight but unambiguous departure from our rules). 

The set of observations $X_i$ is represented by the vector X, and the 
set of $Y_i$ by Y. The $a_j$ and the $b_j$ are likewise written as vectors a, b. 
The (n x k) matrix of the $p_{ij}$ is denoted by p, and the (n x k) matrix 
of the $q_{ij}$ by q. 

We now write (1.1) in matrix form: 

$$\mathcal{E}(X) = pa, \qquad \mathcal{E}(Y) = qb. \tag{2.1}$$

We assume that the variances $\sigma_1^2$, $\sigma_2^2$ of the $X_i$, $Y_i$ respectively, and 
their covariance $\rho\sigma_1\sigma_2$, are constant. The variance matrices and the 
covariance matrix of the observations are then 

$$\mathcal{V}(X) = \sigma_1^2 I, \qquad \mathcal{V}(Y) = \sigma_2^2 I, \qquad \mathcal{C}(X, Y) = \rho\sigma_1\sigma_2 I, \tag{2.2}$$

where 

$$\mathcal{C}(X, Y) = \mathcal{E}(XY') - \mathcal{E}(X)\,.\,\mathcal{E}(Y')$$

and 

$$\mathcal{V}(X) = \mathcal{C}(X, X).$$

The two matrix equations (2.1) can be combined in the form 

$$\mathcal{E}(Z) = rc, \tag{2.3}$$

where 

$$Z = \begin{pmatrix} X \\ Y \end{pmatrix}, \qquad r = \begin{pmatrix} p & 0 \\ 0 & q \end{pmatrix}, \qquad c = \begin{pmatrix} a \\ b \end{pmatrix}.$$

The variance matrix of Z is 

$$v = \begin{pmatrix} \sigma_1^2 I & \rho\sigma_1\sigma_2 I \\ \rho\sigma_1\sigma_2 I & \sigma_2^2 I \end{pmatrix}. \tag{2.4}$$


3. Estimation of the Linear Coefficients 


If we consider now the estimation of a and b, the Least Squares 
Theorem gives the best unbiased linear estimate as 

$$\hat{c} = (r'v^{-1}r)^{-1}\, r'v^{-1}Z, \tag{3.1}$$

with variance matrix 

$$\mathcal{V}(\hat{c}) = (r'v^{-1}r)^{-1}. \tag{3.2}$$

If $\rho$ is unknown, however, an estimate involving v will not usually 
be of any practical use. We must then do the best we can, which is 
to use the two equations (2.1) separately, obtaining the estimates 





$$\hat{a} = (p'p)^{-1}p'X, \qquad \hat{b} = (q'q)^{-1}q'Y. \tag{3.3}$$

We shall refer to this as the “separate” procedure. It will still be 
worth while to investigate the “joint” procedure which leads to (3.1), 
however, since it turns out that in certain circumstances the two 
procedures lead to the same estimates. 


3a. The Special Case of p = q 

Before investigating the general question, we consider the special 
case examined by Aitken, namely the case where p = q. We obtain 
the interesting result that the “separate” estimates here coincide with 
the “joint” estimates, so that, in this case at least, it appears that we 
lose nothing by not knowing $\rho$. 

To prove this we develop (3.1), which becomes 

$$\hat{c} = (s'v^{-1}s)^{-1}\, s'v^{-1}Z,$$

where 

$$s = \begin{pmatrix} p & 0 \\ 0 & p \end{pmatrix}.$$

The submatrices into which v, given by (2.4), is partitioned are scalar 
and thus commutative. Hence we have at once 

$$v^{-1} = \frac{1}{\sigma_1^2\sigma_2^2(1-\rho^2)} \begin{pmatrix} \sigma_2^2 I & -\rho\sigma_1\sigma_2 I \\ -\rho\sigma_1\sigma_2 I & \sigma_1^2 I \end{pmatrix}.$$

Thus 

$$s'v^{-1}s = \frac{1}{\sigma_1^2\sigma_2^2(1-\rho^2)} \begin{pmatrix} \sigma_2^2\, p'p & -\rho\sigma_1\sigma_2\, p'p \\ -\rho\sigma_1\sigma_2\, p'p & \sigma_1^2\, p'p \end{pmatrix}. \tag{3.4}$$

The submatrices into which this matrix is partitioned are again 
commutative. The inverse is thus 

$$(s'v^{-1}s)^{-1} = \begin{pmatrix} \sigma_1^2 (p'p)^{-1} & \rho\sigma_1\sigma_2 (p'p)^{-1} \\ \rho\sigma_1\sigma_2 (p'p)^{-1} & \sigma_2^2 (p'p)^{-1} \end{pmatrix}, \tag{3.5}$$

whence, finally, 

$$\hat{c} = \begin{pmatrix} (p'p)^{-1}p'X \\ (p'p)^{-1}p'Y \end{pmatrix}. \tag{3.6}$$

These results are identical with the “separate” estimates (3.3). 


3b. The General Case: p ≠ q 


We now consider whether results similar to the above can be obtained 
when p and q are no longer equal. The matrix $r'v^{-1}r$ of (3.1) now has 
the form 

$$\sigma_1^2\sigma_2^2(1-\rho^2)\; r'v^{-1}r = \begin{pmatrix} \sigma_2^2\, p'p & -\rho\sigma_1\sigma_2\, p'q \\ -\rho\sigma_1\sigma_2\, q'p & \sigma_1^2\, q'q \end{pmatrix}. \tag{3.7}$$

The submatrices here being no longer commutative, inversion of the 





matrix must be carried out from first principles. We therefore rewrite 
(3.1) in the form 

$$(r'v^{-1}r)\,\hat{c} = r'v^{-1}Z.$$

Using (3.7), this becomes 

$$\sigma_2^2\, p'p\,\hat{a} - \rho\sigma_1\sigma_2\, p'q\,\hat{b} = \sigma_2^2\, p'X - \rho\sigma_1\sigma_2\, p'Y,$$

$$-\rho\sigma_1\sigma_2\, q'p\,\hat{a} + \sigma_1^2\, q'q\,\hat{b} = -\rho\sigma_1\sigma_2\, q'X + \sigma_1^2\, q'Y.$$

Solving these two matrix equations for $\hat{a}$ and $\hat{b}$ we obtain 

$$[p'(I - \rho^2 q(q'q)^{-1}q')p]\,\hat{a} = p'[I - \rho^2 q(q'q)^{-1}q']X - \rho\,\frac{\sigma_1}{\sigma_2}\, p'[I - q(q'q)^{-1}q']Y \tag{3.8a}$$

and 

$$[q'(I - \rho^2 p(p'p)^{-1}p')q]\,\hat{b} = q'[I - \rho^2 p(p'p)^{-1}p']Y - \rho\,\frac{\sigma_2}{\sigma_1}\, q'[I - p(p'p)^{-1}p']X. \tag{3.8b}$$

These results are to be contrasted with the “separate” estimates (3.3). 
It will be seen that the “separate” estimates will coincide with these 
“joint” estimates if and only if the coefficient of Y in (3.8a) and of 
X in (3.8b) both vanish; i.e., if 

$$p'[I - q(q'q)^{-1}q'] = q'[I - p(p'p)^{-1}p'] = 0. \tag{3.9}$$

These conditions may be written as 

$$q(q'q)^{-1}q'p = p, \qquad p(p'p)^{-1}p'q = q. \tag{3.10}$$

These two relations are in fact equivalent. Premultiplying the first, 
for example, by p', we obtain 

$$p'q(q'q)^{-1}q'p = p'p;$$

now the matrix p'p is non-singular, and since p'q and its transpose q'p 
are square matrices it follows that p'q is itself non-singular. Thus 

$$(q'q)^{-1}q'p = (p'q)^{-1}p'p,$$

so that 

$$q'p(p'p)^{-1}p'q = q'q. \tag{3.11}$$

If we now postmultiply the first relation (3.10) by $(p'p)^{-1}p'q$ 
we obtain 

$$p(p'p)^{-1}p'q = q(q'q)^{-1}q'p(p'p)^{-1}p'q = q(q'q)^{-1}q'q = q$$

by (3.11), and this is the second of the relations* (3.10). 

* This proof, which is much better than my original proof, is due to the referees. 





The two relations (3.10) are thus equivalent, and either gives a 
necessary and sufficient condition for the coincidence of the two kinds 
of estimate. 
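
The coincidence of the two kinds of estimate can be illustrated numerically. The sketch below is not part of the paper: the design matrices, the observations and the helper names are all hypothetical, and exact rational arithmetic is used so that equality can be tested without rounding. It takes q = pT with T non-singular, so that p and q span the same column space and conditions (3.10) hold, and compares the “joint” estimate (3.1) with the “separate” estimates (3.3).

```python
from fractions import Fraction as Fr

def transpose(a):
    return [list(col) for col in zip(*a)]

def mul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def inv(a):
    """Gauss-Jordan inverse over exact fractions (assumes a is non-singular)."""
    n = len(a)
    aug = [[Fr(x) for x in row] + [Fr(int(i == j)) for j in range(n)]
           for i, row in enumerate(a)]
    for c in range(n):
        piv = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[piv] = aug[piv], aug[c]
        pivot = aug[c][c]
        aug[c] = [x / pivot for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                factor = aug[r][c]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def least_squares(design, obs):
    """'Separate' estimate (d'd)^-1 d' obs of (3.3); obs is a column matrix."""
    dt = transpose(design)
    return mul(inv(mul(dt, design)), mul(dt, obs))

def joint_estimate(p, q, x, y, s11, s12, s22):
    """'Joint' estimate (3.1) with v = [[s11 I, s12 I], [s12 I, s22 I]]."""
    n, k = len(p), len(p[0])
    zero = [[Fr(0)] * k for _ in range(n)]
    r = [pr + zr for pr, zr in zip(p, zero)] + [zr + qr for zr, qr in zip(zero, q)]
    v = [[Fr(0)] * (2 * n) for _ in range(2 * n)]
    for i in range(n):
        v[i][i], v[n + i][n + i] = Fr(s11), Fr(s22)
        v[i][n + i] = v[n + i][i] = Fr(s12)
    rt_vinv = mul(transpose(r), inv(v))
    return mul(inv(mul(rt_vinv, r)), mul(rt_vinv, x + y))

# Hypothetical data: n = 4 observations, k = 2 coefficients per variate.
p = [[Fr(1), Fr(t)] for t in range(4)]
q = mul(p, [[Fr(1), Fr(1)], [Fr(0), Fr(2)]])   # same column space as p
x = [[Fr(val)] for val in (1, 3, 2, 5)]
y = [[Fr(val)] for val in (2, 0, 4, 1)]

# sigma_1^2 = 1, sigma_2^2 = 4, rho * sigma_1 * sigma_2 = 1 (i.e. rho = 1/2).
separate = least_squares(p, x) + least_squares(q, y)
joint = joint_estimate(p, q, x, y, s11=1, s12=1, s22=4)
assert joint == separate
```

Replacing q by a design with a different column space, for example columns (1, t²), makes the two estimates differ, in agreement with the “only if” part of the condition.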


4. Estimation of Variance and Covariance 

Aitken has shown that in the case where p = q a bilinear form in 
the residuals of X and Y furnishes an unbiased estimate of the covariance 
$\rho\sigma_1\sigma_2$; and has extended this to some special cases of p ≠ q where 
p and q are suitably related to one another. His derivations assume 
the residuals to be normally distributed. 

However, these results can be established without appealing to 
normality, and without requiring any special relationship between 
p and q. This is very satisfying in that it preserves the distribution-free 
nature of Least Squares estimation. 

We assume, as before, that 

$$\mathcal{E}(X) = pa, \qquad \mathcal{E}(Y) = qb,$$

and we estimate a and b by the “separate” procedure (3.3). We then 
obtain the residual vectors: 

$$E = X - p\hat{a} = mX, \qquad F = Y - q\hat{b} = nY, \tag{4.1}$$

where 

$$m = I - p(p'p)^{-1}p' = m' = m^2, \qquad n = I - q(q'q)^{-1}q' = n' = n^2. \tag{4.2}$$

We note that 

$$mp = nq = 0,$$

whence, by (2.1), E and F have zero expectations. Their variance 
matrices are then 

$$\mathcal{V}(E) = \mathcal{E}(EE') = \sigma_1^2\, m, \qquad \mathcal{V}(F) = \mathcal{E}(FF') = \sigma_2^2\, n. \tag{4.3}$$


4a. Estimation of the Variances 

We consider first the estimation of $\sigma_1^2$ from the squared residuals. 
The expectations of the individual squared residuals are the diagonal 
elements of the variance matrix (4.3); hence the expectation of the sum 
E'E of squared residuals is the sum of these diagonal elements, the 
trace of the variance matrix.* Thus 

$$\mathcal{E}(E'E) = \sigma_1^2\,\mathrm{tr}(m). \tag{4.4}$$

* For this remark, which greatly simplifies my original derivation, I am indebted 
to the referees. 





To evaluate this we note that 

$$\mathrm{tr}(\lambda\mu) = \mathrm{tr}(\mu\lambda)$$

for any matrices $\lambda$, $\mu$ for which the two products $\lambda\mu$ and $\mu\lambda$ both exist. 
Thus 

$$\mathrm{tr}\{p(p'p)^{-1}p'\} = \mathrm{tr}\{(p'p)^{-1}p'p\} = \mathrm{tr}(I_k) = k,$$

where $I_k$ denotes the unit matrix of order k. It follows that 

$$\mathrm{tr}(m) = \mathrm{tr}\{I_n - p(p'p)^{-1}p'\} = n - k.$$

Thus 

$$\mathcal{E}(E'E) = (n - k)\,\sigma_1^2,$$

and so 

$$\check\sigma_1^2 = (E'E)/(n - k) \tag{4.5}$$

is an unbiased estimate of $\sigma_1^2$. Similarly 

$$\check\sigma_2^2 = (F'F)/(n - k)$$

is an unbiased estimate of $\sigma_2^2$. 

These results are of course familiar. The present derivation is given 
however since it suggests a method of estimating the covariance $\rho\sigma_1\sigma_2$. 

4b. Estimate of Covariance 

We consider now the products of residuals $E_i F_i$. The expectation of 
such a product is a diagonal element of the covariance matrix $\mathcal{C}(E, F)$. 
The expectation of the sum E'F of products of residuals is therefore the 
trace of this matrix.* 

Now 

$$\mathcal{C}(E, F) = \mathcal{C}(mX, nY) = m\,.\,\mathcal{C}(X, Y)\,.\,n' = \rho\sigma_1\sigma_2\, mn,$$

by (2.2). Taking the trace of this we therefore have 

$$\mathcal{E}(E'F) = \rho\sigma_1\sigma_2\,\mathrm{tr}(mn), \tag{4.6}$$

and our unbiased estimate of the covariance is 

$$(\rho\sigma_1\sigma_2)^{\vee} = (E'F)/\mathrm{tr}(mn). \tag{4.7}$$

In general this last expression cannot be further simplified since 
there is no simple expansion for the trace of a product. In any specific 
application, of course, tr(mn) can easily be computed. 
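
As an illustration of how tr(m) and tr(mn) might be computed in a specific application, the following sketch uses a small hypothetical design (n = 4, k = 2). To avoid a general matrix inverse it assumes, as a simplification that is not required by the paper, that the columns of each design matrix are mutually orthogonal, so that p(p'p)⁻¹p' is a sum of rank-one projections. All names and data are illustrative.

```python
from fractions import Fraction as Fr

def residual_maker(columns):
    """I - p(p'p)^-1 p' for a design whose columns are mutually orthogonal."""
    n = len(columns[0])
    m = [[Fr(int(i == j)) for j in range(n)] for i in range(n)]
    for c in columns:
        norm = sum(v * v for v in c)
        for i in range(n):
            for j in range(n):
                m[i][j] -= Fr(c[i] * c[j], norm)
    return m

def apply(a, v):
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def trace_of_product(a, b):
    size = len(a)
    return sum(a[i][j] * b[j][i] for i in range(size) for j in range(size))

# Hypothetical designs with orthogonal columns, and hypothetical observations.
p_cols = [[1, 1, 1, 1], [-3, -1, 1, 3]]
q_cols = [[1, 1, 1, 1], [1, -1, -1, 1]]
x = [1, 3, 2, 5]
y = [2, 0, 4, 1]

m, n_mat = residual_maker(p_cols), residual_maker(q_cols)
n_obs, k = len(x), len(p_cols)
e, f = apply(m, x), apply(n_mat, y)         # residual vectors E = mX, F = nY

trace_m = sum(m[i][i] for i in range(n_obs))  # equals n - k
var_x = dot(e, e) / (n_obs - k)               # estimate (4.5)
var_y = dot(f, f) / (n_obs - k)
h = trace_of_product(m, n_mat)                # tr(mn), the divisor in (4.7)
cov_xy = dot(e, f) / h                        # estimate (4.7)
```

With these data tr(m) = 2 = n − k, whereas tr(mn) = 1, which shows that the divisor in (4.7) need not equal n − k when p and q span different column spaces.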

* For this remark, which greatly simplifies my original derivation, I am indebted 
to the referees. 





In the special case where p is equal to q, and, more generally, when 
conditions (3.10) hold, we can however find the trace explicitly, since 
then 

$$m = n,$$

whence 

$$\mathrm{tr}(mn) = \mathrm{tr}(m^2) = \mathrm{tr}(m) = n - k.$$

Under these conditions therefore our estimate is 

$$(\rho\sigma_1\sigma_2)^{\vee} = (E'F)/(n - k). \tag{4.8}$$

This result was obtained by Aitken, for normally distributed residuals. 

5. Sampling Variance of the Estimated Variances 
and Covariance 

Aitken has shown that, in the case of normally correlated variates, 
with p = q, the estimate 

$$(\rho\sigma_1\sigma_2)^{\vee} = (E'F)/(n - k)$$

of the covariance has sampling variance given by 

$$\mathcal{V}\{(\rho\sigma_1\sigma_2)^{\vee}\} = (1 + \rho^2)\,\sigma_1^2\sigma_2^2/(n - k). \tag{5.1}$$

This may be compared with the sampling variance of $\check\sigma^2$ obtained from 
the sum of squared residuals in the well-known univariate normal case: 

$$\mathcal{V}(\check\sigma^2) = 2\sigma^4/(n - k). \tag{5.2}$$

It is important for the satisfactory completion of the Least-Squares 
Theorem that both these results (5.1) and (5.2) be derived by 
distribution-free arguments. 

5a. Sampling Variance of the Estimated Variance 

The estimate $\check\sigma_1^2$ given by (4.5) may be written 

$$(n - k)\,\check\sigma_1^2 = X'mX,$$

whence 

$$(n - k)^2\,\mathcal{V}(\check\sigma_1^2) = \mathcal{E}(X'mX)^2 - (n - k)^2\sigma_1^4. \tag{5.3}$$

To evaluate this expression we put 

$$X = U + pa,$$

so that 

$$\mathcal{E}(U) = 0,$$

while the variance matrix and the independence properties of U are the 
same as those of X. 

Since mp = 0, it follows that 

$$X'mX = U'mU,$$

and in order to evaluate (5.3) we need to find the expectation of $(U'mU)^2$, 
or 

$$\Big(\sum_i m_{ii}U_i^2 + \sum_{i \ne j} m_{ij}U_iU_j\Big)^2.$$

Since the $U_i$ are mutually independent, with zero expectation, the only 
surviving terms in this expansion will be those in $U_i^4$ and $U_i^2U_j^2$. 

Let us denote $\mathcal{E}(U_i^4)$ by $\beta_i\sigma_1^4$. We then have 

$$\mathcal{E}(X'mX)^2 = \Big\{\sum_i \beta_i m_{ii}^2 + \sum_{i \ne j}(m_{ii}m_{jj} + 2m_{ij}^2)\Big\}\sigma_1^4$$

$$\phantom{\mathcal{E}(X'mX)^2} = \Big\{\sum_i (\beta_i - 3)\,m_{ii}^2 + \sum_{i,j}(m_{ii}m_{jj} + 2m_{ij}^2)\Big\}\sigma_1^4. \tag{5.4}$$

We reduce the last term as follows. Firstly we note that 

$$\sum_{i,j} m_{ii}m_{jj} = \Big(\sum_i m_{ii}\Big)^2 = \{\mathrm{tr}(m)\}^2 = (n - k)^2.$$

Secondly we have 

$$\sum_{i,j} m_{ij}^2 = \sum_i \Big(\sum_j m_{ij}m_{ji}\Big), \quad \text{since } m = m',$$

$$\phantom{\sum_{i,j} m_{ij}^2} = \sum_i (m^2)_{ii} = \sum_i m_{ii}, \quad \text{since } m = m^2,$$

$$\phantom{\sum_{i,j} m_{ij}^2} = n - k.$$

Equation (5.4) therefore becomes 

$$\mathcal{E}(X'mX)^2 = \sigma_1^4 \sum_i (\beta_i - 3)\,m_{ii}^2 + \{(n - k)^2 + 2(n - k)\}\,\sigma_1^4.$$

Using this expression in (5.3) we finally obtain the sampling variance of 
$\check\sigma_1^2$ as 

$$(n - k)^2\,\mathcal{V}(\check\sigma_1^2) = \sigma_1^4 \sum_i (\beta_i - 3)\,m_{ii}^2 + 2(n - k)\,\sigma_1^4. \tag{5.5}$$

(This formula, of course, also applies to univariate Least Squares analysis. 
It has previously been derived by Hsu (1938).) 

In the special case where $\beta_i = 3$ (as in the normal distribution), the 
last expression reduces to the familiar result 

$$\mathcal{V}(\check\sigma_1^2) = 2\sigma_1^4/(n - k).$$

For this last expression to hold, however, normality is not strictly a 
necessary condition: the only requirement is that the (independent) 
observations $X_i$ all have the same variance, and normal kurtosis. 
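
Formula (5.5) can be checked on a small non-normal example by exact enumeration. The sketch below is illustrative only: it assumes a hypothetical two-point error law (value −1 with probability 2/3, value 2 with probability 1/3, so that the mean is zero), n = 4 observations, and a design consisting of a single column of ones (k = 1), for which m = I − J/n with J the matrix of ones.

```python
from fractions import Fraction as Fr
from itertools import product

values = {-1: Fr(2, 3), 2: Fr(1, 3)}   # hypothetical zero-mean error law
n, k = 4, 1                             # p is a single column of ones
m = [[Fr(int(i == j)) - Fr(1, n) for j in range(n)] for i in range(n)]

sigma2 = sum(pr * u**2 for u, pr in values.items())             # variance of U_i
beta = sum(pr * u**4 for u, pr in values.items()) / sigma2**2   # E(U^4) / sigma^4

def quad(u):
    """The quadratic form U'mU."""
    return sum(u[i] * m[i][j] * u[j] for i in range(n) for j in range(n))

# Exact first and second moments of U'mU over all 2**n outcomes.
mean_q = mean_q2 = Fr(0)
for u in product(values, repeat=n):
    prob = Fr(1)
    for ui in u:
        prob *= values[ui]
    q = quad(u)
    mean_q += prob * q
    mean_q2 += prob * q * q

exact = mean_q2 - mean_q**2   # (n - k)^2 times the variance of the estimate
formula = sigma2**2 * (sum((beta - 3) * m[i][i]**2 for i in range(n)) + 2 * (n - k))

assert mean_q == (n - k) * sigma2   # unbiasedness, as in (4.5)
assert exact == formula             # right-hand side of (5.5)
```

Here β = 3/2, so the kurtosis term in (5.5) is negative and the sampling variance is smaller than the normal-theory value 2σ⁴/(n − k).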

5b. Sampling Variance of the Estimated Covariance 

A very similar argument yields the sampling variance of the estimate 
of covariance. We have: 





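$$(n - k)\,(\rho\sigma_1\sigma_2)^{\vee} = E'F = X'm'nY = X'\mu Y,$$

say, where 

$$\mu = m'n \quad (\ne \mu').$$

Let 

$$X = U + pa, \qquad Y = V + qb,$$

so that 

$$\mathcal{E}(U) = \mathcal{E}(V) = 0.$$

Then 

$$E'F = U'\mu V,$$

and 

$$(n - k)^2\,\mathcal{V}\{(\rho\sigma_1\sigma_2)^{\vee}\} = \mathcal{E}(U'\mu V)^2 - (n - k)^2(\rho\sigma_1\sigma_2)^2. \tag{5.6}$$

In evaluating the expected value of $(U'\mu V)^2$ we note that the $U_i$, $V_j$ are 
independent when $i \ne j$, with zero expectations, so that in squaring the 
bilinear form we need retain only terms in $U_i^2V_j^2$ and $U_iU_jV_iV_j$. 
Neglecting terms of zero expectation, we have 

$$(U'\mu V)^2 = \Big(\sum_{i,j}\mu_{ij}U_iV_j\Big)^2 = \sum_i \mu_{ii}^2U_i^2V_i^2 + \sum_{i \ne j}\mu_{ij}^2U_i^2V_j^2 + \sum_{i \ne j}(\mu_{ij}^2 + \mu_{ii}\mu_{jj})\,U_iU_jV_iV_j. \tag{5.7}$$

We now take expectations, noting that 

$$\mathcal{E}(U_i^2V_j^2) = \sigma_1^2\sigma_2^2 \quad (i \ne j),$$

$$\mathcal{E}(U_iU_jV_iV_j) = \mathcal{E}(U_iV_i)\,.\,\mathcal{E}(U_jV_j) = (\rho\sigma_1\sigma_2)^2 \quad (i \ne j).$$

Let us denote $\mathcal{E}(U_i^2V_i^2)$ by $\nu_i\sigma_1^2\sigma_2^2$. Then (5.7) yields: 

$$\mathcal{E}\Big(\sum_{i,j}\mu_{ij}U_iV_j\Big)^2 = \sigma_1^2\sigma_2^2\Big\{\sum_i \nu_i\mu_{ii}^2 + \sum_{i \ne j}\mu_{ij}^2 + \rho^2\sum_{i \ne j}(\mu_{ij}^2 + \mu_{ii}\mu_{jj})\Big\}$$

$$= \sigma_1^2\sigma_2^2\Big[\sum_i \{\nu_i - (1 + 2\rho^2)\}\,\mu_{ii}^2 + \sum_{i,j}\mu_{ij}^2 + \rho^2\sum_{i,j}(\mu_{ij}^2 + \mu_{ii}\mu_{jj})\Big]. \tag{5.8}$$

This may be somewhat simplified by noting that 

$$\sum_{i,j}\mu_{ij}^2 = \sum_i (\mu\mu')_{ii} = \mathrm{tr}(\mu\mu') = \mathrm{tr}(m'nm) = \mathrm{tr}(mm'n) = \mathrm{tr}(mn) = \mathrm{tr}(\mu) = h,$$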





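say, and 

$$\sum_{i,j}\mu_{ii}\mu_{jj} = \Big(\sum_i \mu_{ii}\Big)\Big(\sum_j \mu_{jj}\Big) = \{\mathrm{tr}(\mu)\}^2 = h^2.$$

Using these values in (5.6) and (5.8) we obtain 

$$(n - k)^2\,\mathcal{V}\Big\{\frac{(\rho\sigma_1\sigma_2)^{\vee}}{\sigma_1\sigma_2}\Big\} = \sum_i \{\nu_i - (1 + 2\rho^2)\}\,\mu_{ii}^2 + h(1 + \rho^2) + \rho^2\{h^2 - (n - k)^2\}. \tag{5.10}$$

This is the general expression for the sampling variance of the estimate 
of covariance. 

In the special case when m = n (and, in particular, when p = q) 
we have 

$$h = n - k,$$

and (5.10) becomes 

$$(n - k)^2\,\mathcal{V}\Big\{\frac{(\rho\sigma_1\sigma_2)^{\vee}}{\sigma_1\sigma_2}\Big\} = \sum_i \{\nu_i - (1 + 2\rho^2)\}\,m_{ii}^2 + (n - k)(1 + \rho^2). \tag{5.9}$$

In the case of normality, $\nu_i = 1 + 2\rho^2$, and (5.9) reduces to Aitken's 
result: 

$$\mathcal{V}\{(\rho\sigma_1\sigma_2)^{\vee}\} = (1 + \rho^2)\,\sigma_1^2\sigma_2^2/(n - k).$$

Acknowledgments 

The author is deeply indebted to the referees for much detailed and 
constructive advice. 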


REFERENCES TO LITERATURE 

Aitken, A. C., 1948. "On a Problem in Correlated Errors", Proc. Roy. Soc. 
Edin., A, lxii, 273-277. 

Hsu, P. L., 1938. "On the Best Unbiased Quadratic Estimate of the Variance", 
Statist. Res. Memoirs, ii, 90, (73). 


(Issued separately May 30, 1952) 







XXI.— The Statistical Theory of Stiff Chains. By H. E. Daniels, 

M.A.(Cantab.), Ph.D.(Edin.), Statistical Laboratory, Uni¬ 
versity of Cambridge. Communicated by Professor A. C. 
Aitken, F.R.S. 

(MS. received July 31, 1951. Read December 3, 1951. 

Revised MS. received January 3, 1952.) 

Synopsis 

The paper is concerned with the distributional properties of Markoff chains in 
two and three dimensions where the transition probability for the length of a step 
and its orientation relative to that of the previous step is specified. 

The discrete two-dimensional chain of n steps is first discussed, and by the use 
of moving axes an equation relating characteristic functions of the end-point 
distribution for successive values of n is obtained. The corresponding differential 
equation for the limiting chain with continuous first derivatives is given and 
asymptotic solutions for long chains are found. 

The three-dimensional chain is similarly treated in terms of moving axes, and 
the limiting continuous chain is again discussed. Finally the same methods are 
applied to the discrete chain of equal steps to obtain the asymptotic form of the 
end-point distribution for long chains. 

1. Introduction.— Since Rayleigh's solution of the problem of 
“random flights” (1919), distribution problems associated with freely 
linked random chains have been widely discussed under the name of 
“random walk” problems (see, e.g., Bartlett (1949)). In the present 
paper we are concerned with the statistical behaviour of what will be 
called stiff chains, in which the orientation of a given link of the chain 
is influenced by those of the neighbouring links. In Kuhn's theory of 
rubber elasticity, for example, the chain molecules are stiff in this sense, 
and are replaced for simplicity by “equivalent” chains containing a 
smaller number of freely jointed links. Another example of a stiff chain 
is provided by the path of a heavy particle moving through an atmosphere 
of light particles, the deflection at each collision being relative to the 
direction of motion prior to the collision. This multiple scattering 
problem is considered by Rossi and Greisen (1941), and by Moyal (1950) 
who gives comprehensive references. 

The term “chain” is used here in its widest sense, and since the 
chain is in general regarded as having a specific direction, it is convenient 
to distinguish one end as the initial or starting point and to regard the 
chain as proceeding in a series of steps from one point to the next. The 
purpose of this paper is to develop the theory of stiff chains which have 
a Markoff property, the transition probability being the chance that a 
step has a given length and given orientation relative to that of the 




previous step. Many physical chain processes are adequately repre¬ 
sented by a model of this kind. 

If the orientation of the final step is alone of interest, its distribution 
can be obtained by the methods of Perrin (1928) and Goudsmit and 
Saunderson (1940), the problem being essentially that of random walk 
on a sphere. When step length is taken into account the analysis is 
very complicated, and even in the case considered here of end-separation 
of the chain it has so far only been possible to obtain limiting 
approximations to the distribution. 

It is assumed throughout that each point of space may be crossed by 
the chain any number of times. 


2. The Two-dimensional Chain.— Let $a$ be the length of one of the 
steps of the chain, and let $\alpha$ be the angle it makes with the previous 
step, with the convention that $\alpha = 0$ when the steps are in line. Set 
up Cartesian axes with origin at the end-point of the chain so that the 
axis of x lies along the final step. There is a translation and rotation of 
axes with each step, the addition of the final step transforming the 
co-ordinates of the initial point from $(x', y')$ to $(x, y)$ according to the 
relation 

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a \\ 0 \end{pmatrix} + \begin{pmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix}, \tag{2.1}$$

which incidentally defines the sense of $\alpha$. The reason for the unusual 
choice of axes is that the process is expressible as a Markoff chain in 
$(x, y)$ alone, the relation between successive probabilities being 

$$f(x, y; n) = E f(x', y'; n-1) = \int_0^\infty da \int_{-\pi}^{\pi} d\alpha\; f(x', y'; n-1)\, g(a, \alpha; n), \tag{2.2}$$

where E denotes expectation and $f(x, y; n)$, $g(a, \alpha; n)$ are the frequency 
functions for the variables concerned. The formulation as a Markoff 
process in terms of fixed axes would require the introduction of a third 
variable representing the orientation of the final step. 
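
The moving-axes recursion can be illustrated with a short sketch. It is not taken from the paper: it assumes the transformation has the form x = a + x' cos α − y' sin α, y = x' sin α + y' cos α, which is the form implied by the characteristic-function relation that follows, and the step data and function names are hypothetical. The check is that the distance of the initial point from the origin of the moving axes equals the end-to-end distance computed in fixed axes by accumulating headings.

```python
from math import cos, sin, hypot, isclose

def add_step(point, a, alpha):
    """Co-ordinates of the initial point after one more step (moving axes)."""
    x, y = point
    return (a + x * cos(alpha) - y * sin(alpha),
            x * sin(alpha) + y * cos(alpha))

def end_separation_fixed_axes(steps):
    """End-to-end distance in fixed axes, tracking the heading as a third variable."""
    px = py = heading = 0.0
    for a, alpha in steps:
        heading += alpha
        px += a * cos(heading)
        py += a * sin(heading)
    return hypot(px, py)

# Hypothetical chain: (step length a, angle alpha with the previous step).
steps = [(1.0, 0.3), (0.5, -0.7), (2.0, 1.1), (1.5, 0.2)]

point = (0.0, 0.0)            # chain of zero steps: initial point = end-point
for a, alpha in steps:
    point = add_step(point, a, alpha)

assert isclose(hypot(*point), end_separation_fixed_axes(steps), rel_tol=1e-12)
```

The moving-axes version carries only the pair (x, y) from step to step, whereas the fixed-axes version has to carry the heading as well, which is the point made in the text.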

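The characteristic functions corresponding to $f$ and $g$ are 

$$\phi(\xi, \eta) = E e^{i(\xi x + \eta y)}, \qquad \gamma(u, v) = E e^{i(ua + v\alpha)}.$$

Then 

$$\phi(\xi, \eta; n) = \int_0^\infty e^{i\xi a}\, da \int_{-\pi}^{\pi} \phi(\xi\cos\alpha + \eta\sin\alpha,\; -\xi\sin\alpha + \eta\cos\alpha;\; n-1)\; g(a, \alpha; n)\, d\alpha.$$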





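Writing 

$$\phi(\rho\cos\psi, \rho\sin\psi; n) = \Phi(\rho, \psi; n)$$

this reduces to 

$$\Phi(\rho, \psi; n) = \int_0^\infty e^{ia\rho\cos\psi}\, da \int_{-\pi}^{\pi} \Phi(\rho, \psi - \alpha; n-1)\, g(a, \alpha; n)\, d\alpha, \tag{2.3}$$

or, in operational form [cf. Bartlett (1)], 

$$\Phi(\rho, \psi; n) = \gamma\Big(\rho\cos\psi,\; i\frac{\partial}{\partial\psi};\; n\Big)\,\Phi(\rho, \psi; n-1). \tag{2.4}$$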

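3. The polar forms of the characteristic and frequency functions are 
related in the following way: Putting $f(r\cos\theta, r\sin\theta) = F(r, \theta)$, 

$$\Phi(\rho, \psi) = \int_0^\infty \int_{-\pi}^{\pi} e^{i\rho r\cos(\theta - \psi)}\, F(r, \theta)\, r\, dr\, d\theta,$$

and if F and $\Phi$ are developed as Fourier series, 

$$F(r, \theta) = \sum_s F_s(r)\, e^{is\theta}, \qquad \Phi(\rho, \psi) = \sum_s \Phi_s(\rho)\, e^{is\psi},$$

it follows without difficulty that $F_s(r)$ and $\Phi_s(\rho)$ are Fourier-Bessel 
transforms, 

$$\Phi_s(\rho) = 2\pi i^s \int_0^\infty F_s(r)\, J_s(r\rho)\, r\, dr, \qquad F_s(r) = \frac{1}{2\pi i^s} \int_0^\infty \Phi_s(\rho)\, J_s(r\rho)\, \rho\, d\rho. \tag{3.1}$$

In particular the radial frequency function is obtainable directly from 

$$\Phi_0(\rho) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \Phi(\rho, \psi)\, d\psi$$

in the form 

$$2\pi r F_0(r) = r \int_0^\infty \Phi_0(\rho)\, J_0(r\rho)\, \rho\, d\rho \tag{3.2}$$

and $\mu_j$, the $j$th moment of $r^2$ about zero, can be found from the expansion 

$$\Phi_0(\rho) = \sum_j \frac{(-\tfrac14\rho^2)^j}{(j!)^2}\, \mu_j. \tag{3.3}$$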

4. A few specially simple cases may be noted. The process reduces 
to that of the classical random walk in one and two dimensions 
respectively when (i) $\alpha$ has both zero mean and zero variance, and (ii) when 





$g(a, \alpha)$ is independent of $\alpha$. A less trivial case is that in which $\alpha$ has 
zero variance but mean $\alpha_0$, not zero or $\pm\pi$. Then 

$$\gamma(u, v; n) = \gamma(u, 0; n)\, e^{iv\alpha_0}$$

and 

$$\Phi(\rho, \psi; n) = \gamma(\rho\cos\psi, 0; n)\,\Phi(\rho, \psi - \alpha_0; n-1) = \prod_{j=0}^{n-1} \gamma(\rho\cos(\psi - j\alpha_0), 0; n-j),$$

since $\Phi(\rho, \psi; 0) = 1$. The limiting distribution for large n will usually 
be of circular normal form by the central limit theorem. If $\gamma$ is 
independent of n it is sufficient that the variance of $a$ should be finite and 
non-zero. (If the variance of $a$ is zero the chain always lies on a circle of 
radius $\tfrac12 a\,\mathrm{cosec}\,\tfrac12\alpha_0$.) When $\gamma$ depends on n, an additional condition 
on the third absolute moment of $a$ ensures the limiting circular normal 
distribution. 
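
The parenthetical remark about the circle can be verified directly. The sketch below is an illustration with hypothetical values of a and α₀: it builds a chain of equal steps with a constant angle between successive steps, finds the circle through the first three vertices, and checks that every vertex lies on it and that its radius is ½a cosec ½α₀. The function names are not from the paper.

```python
import cmath
from math import sin, isclose

def chain_vertices(a, alpha0, n):
    """Vertices (as complex numbers) of a chain of n equal steps with constant turn."""
    pts, z, heading = [0j], 0j, 0.0
    for _ in range(n):
        heading += alpha0
        z += a * cmath.exp(1j * heading)
        pts.append(z)
    return pts

def circumcentre(z1, z2, z3):
    """Centre of the circle through three non-collinear points."""
    w = (z3 - z1) / (z2 - z1)
    return z1 + (z2 - z1) * (w - abs(w) ** 2) / (w - w.conjugate())

a, alpha0 = 1.0, 0.4                      # hypothetical step length and angle
pts = chain_vertices(a, alpha0, 30)
centre = circumcentre(pts[0], pts[1], pts[2])
radius = a / (2 * sin(alpha0 / 2))        # (1/2) a cosec (alpha0 / 2)

assert all(isclose(abs(p - centre), radius, rel_tol=1e-9) for p in pts)
```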


5. Limiting Continuous Chain. —The main purpose of the paper 
is to consider cases where the $\alpha$ distribution is non-degenerate. In the 
remainder of the discussion on the two-dimensional problem we consider 
the limiting continuous chain consisting of a large number of small steps, 
each deviating slightly in direction from its predecessor.* 

Assume that $g(a, \alpha)$ is independent of n. Equation (2.4) may be 
formally expanded as 

$$\Phi(\rho, \psi; n) = \Big\{1 + E(a)\,i\rho\cos\psi - E(\alpha)\frac{\partial}{\partial\psi} - \tfrac12 E(a^2)\,\rho^2\cos^2\psi - E(a\alpha)\,i\rho\cos\psi\,\frac{\partial}{\partial\psi} + \tfrac12 E(\alpha^2)\frac{\partial^2}{\partial\psi^2} + \dots\Big\}\,\Phi(\rho, \psi; n-1). \tag{5.1}$$

The order of magnitude of the coefficients as $n \to \infty$ will determine the 
limiting form of (5.1); they are here chosen to give limiting chains 
with continuous first derivatives. Allow $t = nE(a)$ to remain finite and 
let 

$$\frac{E(\alpha)}{E(a)} = \kappa, \qquad \frac{E(\alpha^2)}{E(a)} = 2\lambda,$$

so that $E(a)$, $E(\alpha)$, $E(\alpha^2)$ are all $O(n^{-1})$. Further, assume that the 
variance of $a$ is of smaller order than $E(a)$. Then $E(a^2)$ and $E(a\alpha)$ are 
$o(n^{-1})$ and (5.1) is 

$$\Phi(\rho, \psi; n) = \Big\{1 + E(a)\,i\rho\cos\psi - E(\alpha)\frac{\partial}{\partial\psi} + \tfrac12 E(\alpha^2)\frac{\partial^2}{\partial\psi^2} + o(n^{-1})\Big\}\,\Phi(\rho, \psi; n-1),$$

* Mr. D. G. Kendall has pointed out that the equations obtained here by 
heuristic limit operations can be derived rigorously by the methods of M. Kac (1940). 





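whence, writing $E(a) = dt$ in the limit, 

$$\frac{\partial\Phi}{\partial t} = i\rho\cos\psi\,\Phi - \kappa\frac{\partial\Phi}{\partial\psi} + \lambda\frac{\partial^2\Phi}{\partial\psi^2}. \tag{5.2}$$

The required solution of this equation must reduce to $\Phi = 1$ when $t = 0$, 
but it has to satisfy a further important condition. Since $\phi(\xi, \eta)$ is a 
single valued function of $(\xi, \eta)$, $\Phi$ must have a period $2\pi$ in $\psi$. 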


6. Symmetrical Case. —It is simplest to consider first the symmetrical 
case when $\kappa = 0$, 

$$\frac{\partial\Phi}{\partial t} = i\rho\cos\psi\,\Phi + \lambda\frac{\partial^2\Phi}{\partial\psi^2}. \tag{6.1}$$

The required periodic solution of (6.1) can be written down as a 
series of Mathieu functions, 

$$\Phi = \sum_\nu C_\nu\, \mathrm{ce}_\nu,$$

the coefficients $C_\nu$ being determined by orthogonality conditions to 
make $\Phi = 1$ initially (see McLachlan (3)). But both $C_\nu$ and the 
characteristic numbers $\nu$ of $\mathrm{ce}_\nu$ are complicated functions of $\rho$, and the 
subsequent inversion of $\Phi$ to $F(r, \theta)$ presents formidable difficulties. 
Approximations can be found for large t and large r (small $\rho$), but it is best 
to obtain these by solving the equation directly without appealing to 
the special properties of Mathieu functions, as the method is of general 
application. 

Introducing the Laplace transform* 

$$L = \int_0^\infty \Phi\, e^{-pt}\, p\, dt,$$

equation (6.1) becomes 

$$(p - i\rho\cos\psi)\,L - \lambda\frac{\partial^2 L}{\partial\psi^2} = p. \tag{6.2}$$

Since $\Phi$, and hence L, is in this case an even function of $\psi$, the cosine 
expansion 

$$L = L_0 + 2\sum_{s=1}^{\infty} L_s\cos s\psi$$

will provide the required periodic solution. The coefficients $L_s$, which 
are the transforms of $\Phi_s$ (§3), satisfy the equations 

$$pL_0 - i\rho L_1 = p,$$

$$-\tfrac12 i\rho\,L_{s-1} + (p + s^2\lambda)\,L_s - \tfrac12 i\rho\,L_{s+1} = 0, \qquad s \ge 1. \tag{6.3}$$

* I prefer the “dimensionless” form incorporating the factor p. 





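From the first $\nu$ of these equations, $L_0, L_1, \dots, L_\nu$ are found to be 

$$L_0 = \frac{p\,\Delta_{1,\nu} + 2(\tfrac12 i\rho)^{\nu+1} L_{\nu+1}}{2\Delta_{0,\nu} - p\,\Delta_{1,\nu}}, \tag{6.4}$$

$$L_s = \frac{(\tfrac12 i\rho)^s\, p\,\Delta_{s+1,\nu} + (\tfrac12 i\rho)^{\nu-s+1}\,(2\Delta_{0,s-1} - p\,\Delta_{1,s-1})\, L_{\nu+1}}{2\Delta_{0,\nu} - p\,\Delta_{1,\nu}}, \tag{6.5}$$

where 

$$\Delta_{s,\nu} = \begin{vmatrix} p + s^2\lambda & -\tfrac12 i\rho & 0 & 0 & \cdots & 0 & 0 & 0 \\ -\tfrac12 i\rho & p + (s+1)^2\lambda & -\tfrac12 i\rho & 0 & \cdots & 0 & 0 & 0 \\ 0 & -\tfrac12 i\rho & p + (s+2)^2\lambda & -\tfrac12 i\rho & \cdots & 0 & 0 & 0 \\ \vdots & & & & & & & \vdots \\ 0 & 0 & 0 & 0 & \cdots & -\tfrac12 i\rho & p + (\nu-1)^2\lambda & -\tfrac12 i\rho \\ 0 & 0 & 0 & 0 & \cdots & 0 & -\tfrac12 i\rho & p + \nu^2\lambda \end{vmatrix} \qquad (0 \le s \le \nu),$$

and $\Delta_{\nu+1,\nu} = 1$, $\Delta_{\nu+2,\nu} = 0$. 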

The determinants $\Delta_{s,\nu}$ diverge as $\nu \to \infty$, but may be replaced by 
$\Delta_{s,\nu} = (p + s^2\lambda)(p + (s+1)^2\lambda) \dots (p + \nu^2\lambda)\,\bar\Delta_{s,\nu}$, where $\bar\Delta_{s,\nu}$ has unit 
diagonal elements and converges for all $\rho$ and for all $p \ne 0$. It can 
then be proved from (6.5) that $L_\nu \to 0$ as $\nu \to \infty$, and hence that 

jj — _ ^ ij» _ t _ _ (aV 3 )" A, + 1.<* 

° (2^0,00 —pA,_ a )’ *' (P + A) . . . (p + *M)(2A**-pA lt «j 

(Ol). 


7. Asymptotic Distribution for Long Chains. —When t is large 
one expects intuitively that the distribution will approach the circular 
normal form with $r^2 = O(t)$ over the effective range of r. It is therefore 
reasonable to investigate approximations to $L_s$ for small $\rho$ and $p$ with 
$p = O(\rho^2)$. To indicate the order of the approximations, let T denote a 
typical value of t. Terms which are $O(\rho^{2a})$ or $O(p^a)$ are taken to be 
$O(T^{-a})$ in the sense that after the change of scale $t = t'T$, $r = r'T^{1/2}$, $T^{-a}$ 
appears explicitly as a factor in such terms, the corresponding variables 
$\rho'$, $p'$ being $O(1)$ in T. 

Both determinants are $O(1)$, so that $L_s = O(T^{-s/2})$, $s \ge 1$. Using the 
relation 

$$\Delta_{s,\nu} = (p + s^2\lambda)\,\Delta_{s+1,\nu} + \tfrac14\rho^2\,\Delta_{s+2,\nu}, \tag{7.1}$$

(6.4) and (6.5) become 



(7.2) 







(7.3) 


from which approximations to the distribution can be found for large T, 
as successive terms of an asymptotic expansion. Further application of 
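(7.1) shows (7.2) to be the $\nu$th convergent of the continued fraction 

$$L_0 = \cfrac{p}{p + \cfrac{\tfrac12\rho^2}{p + \lambda + \cfrac{\tfrac14\rho^2}{p + 2^2\lambda + \cfrac{\tfrac14\rho^2}{p + 3^2\lambda + \dots}}}}, \tag{7.4}$$

all subsequent numerators being $\tfrac14\rho^2$. 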
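The first approximation ($\nu = 1$) is 

$$L_0 = \frac{p(p + \lambda)}{p(p + \lambda) + \tfrac12\rho^2} + O(T^{-1}) = \frac{p}{p + \rho^2/2\lambda} + O(T^{-1}),$$

$$L_1 = \frac{\tfrac12 i\rho\,p}{p(p + \lambda) + \tfrac12\rho^2} + O(T^{-3/2}) = \frac{i\rho}{2\lambda}\,.\,\frac{p}{p + \rho^2/2\lambda} + O(T^{-3/2}),$$

$$L_s = O(T^{-s/2}), \qquad s \ge 2. \tag{7.5}$$

Thus 

$$L = \Big(1 + \frac{i\rho}{\lambda}\cos\psi\Big)\frac{p}{p + \rho^2/2\lambda} + O(T^{-1}),$$

which is the Laplace transform of 

$$\Phi = \Big(1 + \frac{i\rho}{\lambda}\cos\psi\Big)\,e^{-\rho^2 t/2\lambda} + O(T^{-1}). \tag{7.6}$$

The Cartesian form of (7.6) is 

$$\phi = \Big(1 + \frac{i\xi}{\lambda}\Big)\,e^{-\frac{t}{2\lambda}(\xi^2 + \eta^2)} + O(T^{-1}) = \exp\Big[\frac{i\xi}{\lambda} - \frac{t}{2\lambda}(\xi^2 + \eta^2)\Big] + O(T^{-1}), \tag{7.7}$$

which is the characteristic function for a circular normal distribution 
centred at the point $(1/\lambda, 0)$, the variance of x or y being $t/\lambda$. 

If the displacement of the centre is ignored the distribution may be 
considered circularly normal about the point (0, 0), but the terms 
neglected are then $O(T^{-1/2})$ instead of $O(T^{-1})$. On the other hand, the 
distribution of r depends only on $L_0$ (cf. (3.2)), and is therefore 

$$2\pi r F_0(r) = \frac{\lambda r}{t}\,e^{-\lambda r^2/2t}\,\{1 + O(T^{-1})\},$$

the displacement of the centre having no effect to this order. 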





For the higher approximation it is best to expand (7.2) and (7.3) 
as a Laurent series 


L.= XA. 


P m, 

( p+i x) 


treating $p$ and $\rho^2$ as $O(T^{-1})$, the A's being polynomials in $\rho$. The 
inversion of the Laplace transform is 




values of m<0 giving no contribution.* The Fourier components of 
the frequency function are then most simply obtained from (3.1) using 
the formula 


GO 

/ 


pf+im + i c - \p'r J t ( r p) dp 

_ 2 m r* (Tfl + m+l) mr(Z -(- m + 1) /r 2 \ 

- T i+»+ifp(i + jj ~ r(Z + 2F \ 2 t/ 

_j_ m(m — 1) T(l + ?» + !) (r *\ a 


2 ! 


l + i) (r*y I ;■ n a , 

r(i+3) ' Urj + ' ' I ’ (7 * 8) 


where m is an integer and the series terminates. 
Proceeding in this way, we find for $\nu = 2$, 

K = j 1 + J £ ■- & £ *) + 0{T -*) 

<*>i = j i + U % - A % «] ' 1 * l + 0(T- **) 


®l=-T , lT^fi- 4 ^ + 0 (T- a ), 


and the second approximation to the distribution is 


F(r, 8)rdrd6= X - r drd8e~if 
t lit 


1 


1 . .^ 3 _ 7 

M + *t* 


, f ia7^_ 7 Arfl 

+ [j * A«* + < 3 11,2 l 4 J 


cos 0 



+ *£oos2ff + 0(r-»)j-, (7.9) 

which is easily expressible in Cartesian form if required. 

* In the inversion of the transforms, the contour of Bromwich's integral is 
modified where necessary to end in the second and third quadrants. 





8. Short Continuous Chains. —It is also of interest to examine the 
behaviour of the distribution for small t, the chain being then almost 
straight. For this purpose Cartesian co-ordinates are more convenient, 
and the differential equation (6.1) is replaced by 


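$$\frac{\partial\Phi}{\partial t} = i\xi\Phi + \lambda\left(\xi\frac{\partial}{\partial\eta} - \eta\frac{\partial}{\partial\xi}\right)^{2}\Phi. \qquad (8.1)$$

On substituting

$$\Phi = \sum_{j}\sum_{k}\mu_{jk}\,\frac{(i\xi)^{j}(i\eta)^{k}}{j!\,k!}$$

in (8.1), differential equations are obtained relating the bivariate moments
$\mu_{jk}$ about the origin. Thus

$$\frac{d\mu_{10}}{dt} = 1 - \lambda\mu_{10}, \qquad \frac{d\mu_{01}}{dt} = -\lambda\mu_{01},$$

$$\frac{1}{2}\frac{d\mu_{20}}{dt} = \mu_{10} + \lambda(\mu_{02} - \mu_{20}), \qquad \frac{d\mu_{11}}{dt} = \mu_{01} - 4\lambda\mu_{11}, \qquad \frac{1}{2}\frac{d\mu_{02}}{dt} = \lambda(\mu_{20} - \mu_{02}).$$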


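All the moments are initially zero except $\mu_{00} = 1$, and the solutions of
these equations are

$$\mu_{10} = \frac{1 - e^{-\lambda t}}{\lambda}, \qquad \mu_{01} = 0, \qquad \mu_{11} = 0,$$

$$\mu_{20} = \frac{t}{\lambda} - \frac{3}{4\lambda^{2}} + \frac{2}{3\lambda^{2}}e^{-\lambda t} + \frac{1}{12\lambda^{2}}e^{-4\lambda t},$$

$$\mu_{02} = \frac{t}{\lambda} - \frac{5}{4\lambda^{2}} + \frac{4}{3\lambda^{2}}e^{-\lambda t} - \frac{1}{12\lambda^{2}}e^{-4\lambda t}. \qquad (8.2)$$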


When t is small we find

$$\operatorname{var} x \approx \tfrac{1}{3}\lambda^{2}t^{4}, \qquad \operatorname{var} y \approx \tfrac{2}{3}\lambda t^{3}, \qquad (8.3)$$

where $\operatorname{var} x = \mu_{20} - \mu_{10}^{2}$, $\operatorname{var} y = \mu_{02}$. As might be expected, var x is
of smaller order of magnitude than var y, but an unusual feature is that
var y increases as the cube of the chain length (cf. Rossi and Greisen, (1941),
p. 208, Moyal, 1950, p. 1002). This differs essentially from the corresponding
result when the deviation y of the end-point is the cumulative
sum of n independent random lateral displacements, since var y is then
$O(t)$. The results (8.3) are easily verified directly; for example, suppose
all steps to be of equal length a, and let $\alpha_2, \alpha_3, \ldots, \alpha_n$ be the angles made
by each step with the previous one, so that

$$y = a\sum_{j=2}^{n}\sin(\alpha_2 + \alpha_3 + \ldots + \alpha_j).$$

Then

$$y \approx a\sum_{j=2}^{n}(n - j + 1)\,\alpha_j$$

and

$$\sigma_y^{2} \approx a^{2}\sigma_\alpha^{2}\sum_{j=2}^{n}(n - j + 1)^{2} \approx \tfrac{1}{3}n^{3}a^{2}\sigma_\alpha^{2},$$

with $\lambda = \tfrac{1}{2}\sigma_\alpha^{2}/a$, $t = na$.


To examine the limiting form of the distribution for small t, let T be
again a typical value of t, now a small quantity. Then if $t = t'T$, (8.3)
shows it to be appropriate to write $x = x'T$, $y = y'T^{3/2}$, $\xi = \xi'T^{-1}$,
$\eta = \eta'T^{-3/2}$, when (8.1) becomes

$$\frac{\partial\Phi}{\partial t'} = i\xi'\Phi + \lambda\eta'^{2}\frac{\partial^{2}\Phi}{\partial\xi'^{2}} + O(T). \qquad (8.4)$$

Ignoring $O(T)$, the required solution is

$$\Phi = \exp\left\{i\xi' t' - \tfrac{1}{3}\lambda\eta'^{2}t'^{3}\right\},$$

showing that, to the order considered, x has the constant value t and
the y distribution is normal about zero with variance $\tfrac{2}{3}\lambda t^{3}$ (as may also
be deduced from the central limit theorem).
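The moment results of this section lend themselves to a quick numerical check. The sketch below is illustrative only: it assumes the moment equations in the form dμ10/dt = 1 − λμ10, dμ20/dt = 2μ10 + 2λ(μ02 − μ20), dμ02/dt = 2λ(μ20 − μ02), integrates them with a classical Runge-Kutta step, and compares the result with the closed forms (8.2). The function names and the step count are hypothetical choices, not part of the paper.

```python
import math


def closed_form_moments(lam, t):
    """Closed-form moments (mu10, mu20, mu02) in the form of (8.2)."""
    e1 = math.exp(-lam * t)
    e4 = math.exp(-4.0 * lam * t)
    mu10 = (1.0 - e1) / lam
    mu20 = t / lam - 3.0 / (4.0 * lam**2) + 2.0 * e1 / (3.0 * lam**2) + e4 / (12.0 * lam**2)
    mu02 = t / lam - 5.0 / (4.0 * lam**2) + 4.0 * e1 / (3.0 * lam**2) - e4 / (12.0 * lam**2)
    return mu10, mu20, mu02


def moment_rates(lam, m):
    """Right-hand sides of the moment equations (mu01 and mu11 stay zero)."""
    mu10, mu20, mu02 = m
    return (1.0 - lam * mu10,
            2.0 * (mu10 + lam * (mu02 - mu20)),
            2.0 * lam * (mu20 - mu02))


def integrate_moments(lam, t, steps=2000):
    """Integrate the moment equations from zero initial values by RK4."""
    h = t / steps
    m = (0.0, 0.0, 0.0)
    for _ in range(steps):
        k1 = moment_rates(lam, m)
        k2 = moment_rates(lam, tuple(x + 0.5 * h * k for x, k in zip(m, k1)))
        k3 = moment_rates(lam, tuple(x + 0.5 * h * k for x, k in zip(m, k2)))
        k4 = moment_rates(lam, tuple(x + h * k for x, k in zip(m, k3)))
        m = tuple(x + h * (a + 2.0 * b + 2.0 * c + d) / 6.0
                  for x, a, b, c, d in zip(m, k1, k2, k3, k4))
    return m


def small_t_ratios(lam, t):
    """Ratios of var x and var y to the leading small-t terms of (8.3)."""
    mu10, mu20, mu02 = closed_form_moments(lam, t)
    var_x = mu20 - mu10**2
    var_y = mu02
    return var_x / (lam**2 * t**4 / 3.0), var_y / (2.0 * lam * t**3 / 3.0)
```

For small λt both ratios returned by `small_t_ratios` should be close to unity, which is the content of (8.3).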

It will be observed that the moment formulae (8.2) are exact. Similar
exact formulae for the moments $\mu_j$ of $r^2$ about the origin are more simply
obtained from (3.3); if $L_s$ is expanded as a power series in $\tfrac{1}{2}i\rho$ and
substituted in (6.3), the ensuing recurrence relations between the coefficients
$L_{sm}$ enable $L_{0,2j}$ and hence $\mu_j$ to be calculated in closed form if
required.


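9. General Case.—The preceding methods are applicable to the
more general equation (6.2),

$$\frac{\partial\Phi}{\partial t} = i\rho\cos\psi\,\Phi - \kappa\frac{\partial\Phi}{\partial\psi} + \lambda\frac{\partial^{2}\Phi}{\partial\psi^{2}},$$

and it will suffice to indicate the limiting form of the distribution for
large t. Again $\Phi$ must have period $2\pi$ in $\psi$, but is no longer symmetrical
in $\psi$. Corresponding to (6.2) we obtain

$$(p - i\rho\cos\psi)L + \kappa\frac{\partial L}{\partial\psi} - \lambda\frac{\partial^{2}L}{\partial\psi^{2}} = p,$$

where now

$$L = \sum_{s=-\infty}^{\infty} L_s e^{is\psi}. \qquad (9.1)$$

The coefficients satisfy the equations

$$-\tfrac{1}{2}i\rho L_{s-1} + (p + is\kappa + s^{2}\lambda)L_s - \tfrac{1}{2}i\rho L_{s+1} = p \quad (s = 0)$$
$$\qquad\qquad = 0 \quad (s \neq 0). \qquad (9.2)$$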





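For large T it can be shown as before that $L_s = O(T^{-\frac{1}{2}|s|})$ for $|s| \geq 1$,
and the dominant term of the asymptotic expansion is got by putting
$L_s = 0$ when $|s| \geq 2$ and ignoring terms which are $O(T^{-1})$ in the expressions
for $L_0$, $L_{\pm 1}$. Inverting the Laplace transform we find

$$\Phi_0 = \exp\left\{-\frac{\lambda\rho^{2}t}{2(\lambda^{2} + \kappa^{2})}\right\} + O(T^{-1}), \qquad \Phi_{\pm 1} \approx \frac{i\rho}{2(\lambda \pm i\kappa)}\exp\left\{-\frac{\lambda\rho^{2}t}{2(\lambda^{2} + \kappa^{2})}\right\},$$

so that

$$\Phi = \exp\left\{-\frac{\lambda\rho^{2}t}{2(\lambda^{2} + \kappa^{2})}\right\}\left\{1 + \frac{i\rho(\lambda\cos\psi + \kappa\sin\psi)}{\lambda^{2} + \kappa^{2}}\right\} + O(T^{-1}).$$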

Fig. 1


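To the same order, the limiting distribution is therefore circularly normal
about the point $\left(\dfrac{\lambda}{\lambda^{2} + \kappa^{2}}, \dfrac{\kappa}{\lambda^{2} + \kappa^{2}}\right)$, with variance of x or y equal to
$\lambda t/(\lambda^{2} + \kappa^{2})$.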
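The dominant-term argument of this section can be illustrated numerically. The following is a minimal sketch, assuming the coefficient equations in the form −½iρL(s−1) + (p + isκ + s²λ)L(s) − ½iρL(s+1) = pδ(s,0): it truncates the system at |s| ≤ smax, solves it by Gaussian elimination, and compares L0 with the leading approximation p/{p + λρ²/(2(λ² + κ²))}. The truncation size and the helper names are illustrative assumptions.

```python
def solve_linear(a, b):
    """Solve a small dense complex system by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [list(row) + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0j] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / aug[r][r]
    return x


def truncated_L0(p, rho, lam, kappa, smax=12):
    """L0 from the coefficient equations truncated at |s| <= smax."""
    idx = list(range(-smax, smax + 1))
    n = len(idx)
    a = [[0j] * n for _ in range(n)]
    b = [0j] * n
    for r, s in enumerate(idx):
        a[r][r] = p + 1j * s * kappa + s * s * lam
        if r > 0:
            a[r][r - 1] = -0.5j * rho
        if r < n - 1:
            a[r][r + 1] = -0.5j * rho
        if s == 0:
            b[r] = p
    return solve_linear(a, b)[smax]


def leading_L0(p, rho, lam, kappa):
    """Dominant term of L0 for small p and rho^2 of the same order."""
    return p / (p + lam * rho**2 / (2.0 * (lam**2 + kappa**2)))
```

With p and ρ² both of order 10⁻⁴ the two values should differ only in the fourth decimal place, in line with the stated O(T⁻¹) error.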

10. The Three-dimensional Chain.—As before, a system of moving
axes is employed with origin at the end-point of the chain. The vector
increment for each step is now specified by its length a and two angles
$\alpha$, $\beta$ defined as follows: To preserve the conventional polar notation
the axes are chosen so that the z axis contains the final step, while the
(z, x) plane also contains the previous step, that is, the previous z axis.
The angle between the last two steps is $\alpha$ as before, while $\beta$ is the angle
turned through by the (z, x) plane. Thus ($\alpha$, $\beta$) are the usual colatitude
and azimuth angles specifying the direction of the final step relative to
the previous axes. The spherical diagram (fig. 1) shows the angular
disposition of the systems of axes (x', y', z'), (x, y, z) adopted respectively
before and after a step has been taken.





The co-ordinates of successive points of the chain are connected by
the relation

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ a \end{bmatrix} + \begin{bmatrix} \cos\alpha\cos\beta & \cos\alpha\sin\beta & -\sin\alpha \\ -\sin\beta & \cos\beta & 0 \\ \sin\alpha\cos\beta & \sin\alpha\sin\beta & \cos\alpha \end{bmatrix}\begin{bmatrix} x' \\ y' \\ z' \end{bmatrix}, \qquad (10.1)$$

and the basic probability equation for the chain is

$$f(x, y, z;\, n) = E f(x', y', z';\, n-1) = \int_0^{\infty} da \int_0^{\pi} d\alpha \int_0^{2\pi} d\beta\; g(a, \alpha, \beta;\, n)\, f(x', y', z';\, n-1), \qquad (10.2)$$

where $g(a, \alpha, \beta;\, n)$ is the frequency function for a, $\alpha$, $\beta$ at the nth step.
Note that for the classical random walk case where all directions of the
step are equally likely, g is not independent of the angles but contains
the factor $\sin\alpha$.

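The characteristic function $\Phi(\xi, \eta, \zeta;\, n) = E e^{i(\xi x + \eta y + \zeta z)}$ satisfies the
equation

$$\Phi(\xi, \eta, \zeta;\, n) = E e^{ia\zeta}\,\Phi(\xi', \eta', \zeta';\, n-1) \qquad (10.3)$$

where

$$[\xi\ \ \eta\ \ \zeta] = [\xi'\ \ \eta'\ \ \zeta']\begin{bmatrix} \cos\alpha\cos\beta & \cos\alpha\sin\beta & -\sin\alpha \\ -\sin\beta & \cos\beta & 0 \\ \sin\alpha\cos\beta & \sin\alpha\sin\beta & \cos\alpha \end{bmatrix}$$

and E averages over the a, $\alpha$, $\beta$ distribution.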

It is again convenient to work in polar co-ordinates

$$x = r\sin\theta\cos\chi, \qquad \xi = \rho\sin\psi\cos\omega,$$
$$y = r\sin\theta\sin\chi, \qquad \eta = \rho\sin\psi\sin\omega,$$
$$z = r\cos\theta, \qquad \zeta = \rho\cos\psi.$$

Fig. 2 shows the relation between the axes ($\xi$, $\eta$, $\zeta$) and ($\xi'$, $\eta'$, $\zeta'$), and
the corresponding polar angles.

Equation (10.3) becomes

$$\Phi(\rho, \psi, \omega;\, n) = E e^{ia\rho\cos\psi}\,\Phi(\rho, \psi', \omega';\, n-1), \qquad (10.4)$$

where

$$\cos\psi' = \cos\psi\cos\alpha + \sin\psi\sin\alpha\cos(\omega - \beta),$$
$$\sin\psi'\sin\omega' = \sin\psi\sin(\omega - \beta).$$

There does not appear to be a simple operational form similar to (2.4). 

For simplicity the discussion is now confined to cases where the
angular distribution for each step is axially symmetrical about the direction
of the previous step, and is independent of n. The treatment of
the general case, though analytically complicated, introduces no new
difficulty of principle. Since $\Phi$ is assumed independent of $\omega$ we may write

$$\Phi(\rho, \psi, \omega;\, n) = \Phi(\rho, \psi;\, n), \qquad g(a, \alpha, \beta) = \frac{1}{2\pi}\,g(a, \alpha),$$

and (10.4) reduces to*

$$\Phi(\rho, \psi;\, n) = \int_0^{\infty} e^{ia\rho\cos\psi}\,da \int_0^{\pi} g(a, \alpha)\,d\alpha \int_0^{2\pi}\frac{d\omega}{2\pi}\,\Phi(\rho, \psi';\, n-1), \qquad (10.5)$$

where

$$\cos\psi' = \cos\psi\cos\alpha + \sin\psi\sin\alpha\cos\omega. \qquad (10.6)$$


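11. The polar forms of the characteristic and frequency functions
are related by formulae analogous to those for two dimensions. In the
axially symmetrical case, writing

$$f(x, y, z) = F(r, \theta, \chi) = F(r, \theta)$$

we find

$$\Phi(\rho, \psi) = \int_0^{\infty} r^{2}\,dr \int_0^{\pi}\sin\theta\,d\theta \int_0^{2\pi} d\chi\; F(r, \theta)\, e^{ir\rho\{\cos\theta\cos\psi + \sin\theta\sin\psi\cos(\chi - \omega)\}}. \qquad (11.1)$$

Series of Legendre polynomials now take the place of Fourier cosine
series, and use is made of the expansions

$$e^{ir\rho u} = \sqrt{\frac{\pi}{2r\rho}}\sum_{s=0}^{\infty} i^{s}(2s + 1)\,J_{s+\frac{1}{2}}(r\rho)\,P_s(u), \qquad (11.2)$$

* We now replace $\omega - \beta$ by $\omega$.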





$$P_s(\cos\theta\cos\psi + \sin\theta\sin\psi\cos(\chi - \omega)) = P_s(\cos\theta)P_s(\cos\psi) + 2\sum_{m=1}^{s}\frac{(s - m)!}{(s + m)!}\,P_s^{m}(\cos\theta)\,P_s^{m}(\cos\psi)\cos m(\chi - \omega), \qquad (11.3)$$

where $P_s(u)$ and $P_s^{m}(u)$ are the ordinary and associated Legendre polynomials.
On substituting (11.2) for the exponential in (11.1) and
expanding the Legendre polynomials by (11.3), terms in $\cos m(\chi - \omega)$
vanish on integration and (11.1) becomes

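$$\Phi(\rho, \psi) = \frac{(2\pi)^{3/2}}{\sqrt{\rho}}\sum_{s=0}^{\infty} i^{s}(s + \tfrac{1}{2})\,P_s(\cos\psi)\int_0^{\infty} r^{3/2}J_{s+\frac{1}{2}}(r\rho)\,dr \int_0^{\pi} P_s(\cos\theta)\,F(r, \theta)\sin\theta\,d\theta,$$

showing that if

$$\Phi(\rho, \psi) = \sum_{s=0}^{\infty}\Phi_s(\rho)\,P_s(\cos\psi), \qquad F(r, \theta) = \sum_{s=0}^{\infty}F_s(r)\,P_s(\cos\theta)$$

the coefficients are Fourier-Bessel transforms,

$$\Phi_s(\rho) = \frac{(2\pi)^{3/2}\,i^{s}}{\sqrt{\rho}}\int_0^{\infty} F_s(r)\,J_{s+\frac{1}{2}}(r\rho)\,r^{3/2}\,dr,$$

$$F_s(r) = \frac{(-i)^{s}}{(2\pi)^{3/2}\sqrt{r}}\int_0^{\infty}\Phi_s(\rho)\,J_{s+\frac{1}{2}}(r\rho)\,\rho^{3/2}\,d\rho. \qquad (11.4)$$

In particular

$$\Phi_0(\rho) = \frac{4\pi}{\rho}\int_0^{\infty} F_0(r)\sin r\rho\;r\,dr,$$

or, since the radial frequency function is $4\pi r^{2}F_0(r)$,

$$\Phi_0(\rho) = \sum_{j=0}^{\infty}\frac{(-)^{j}\mu_j\,\rho^{2j}}{(2j + 1)!}, \qquad (11.5)$$

where $\mu_j$ is the jth moment of $r^{2}$ about the origin.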

12. Limiting Continuous Chain.—The limiting continuous chain is
obtained by making a and $\alpha$ small in such a way that $E(\alpha^{2})/E(a) \to 4\lambda$,
and assuming var a to be of smaller order than E(a). The angle $\alpha$ as
now defined lies between 0 and $\pi$, so that $E(\alpha)$ is always positive except
in a degenerate case, whereas in the two-dimensional case $\alpha$ lay between
$-\pi$ and $\pi$, and the analogue of axial symmetry then had $E(\alpha) = 0$.
This is a matter of convention and should cause no confusion.





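Expanding to the required order,

$$\cos\psi' = \cos\psi + \alpha\sin\psi\cos\omega - \tfrac{1}{2}\alpha^{2}\cos\psi + o(\alpha^{2}),$$

$$\Phi_{n-1}(\rho, \psi') = \Phi_{n-1}(\rho, \psi) + (\alpha\sin\psi\cos\omega - \tfrac{1}{2}\alpha^{2}\cos\psi)\frac{\partial\Phi_{n-1}}{\partial(\cos\psi)} + \tfrac{1}{2}\alpha^{2}\sin^{2}\psi\cos^{2}\omega\,\frac{\partial^{2}\Phi_{n-1}}{\partial(\cos\psi)^{2}} + o(\alpha^{2}).$$

Putting $t = nE(a)$, $dt = E(a)$ and $u = \cos\psi$, (10.5) becomes in the limit

$$\frac{\partial\Phi}{\partial t} = i\rho u\,\Phi + \lambda\frac{\partial}{\partial u}\left\{(1 - u^{2})\frac{\partial\Phi}{\partial u}\right\}. \qquad (12.1)$$

The Laplace transform L of $\Phi$ with respect to t satisfies the equation

$$(p - i\rho u)L - \lambda\frac{\partial}{\partial u}\left\{(1 - u^{2})\frac{\partial L}{\partial u}\right\} = p. \qquad (12.2)$$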


The analysis then proceeds as in the two-dimensional problem, except
that L is expanded as a series of Legendre polynomials in $u = \cos\psi$,

$$L = \sum_{s=0}^{\infty} L_s P_s(u),$$

which, after substitution in (12.2) and simplification by well-known
recurrence formulae, leads to the equations

$$pL_0 - \tfrac{1}{3}i\rho L_1 = p,$$

$$-i\rho\,\frac{s}{2s - 1}\,L_{s-1} + \{p + s(s + 1)\lambda\}L_s - i\rho\,\frac{s + 1}{2s + 3}\,L_{s+1} = 0, \qquad s \geq 1.$$

These are solved as before, the approximation for large T being 

ihf — + 0(T- | 


(12.3) 


L 0 = P^’ + O(T-“)= . T 1A 


4 = 


M, + 1 . 


where 




li/_4- OfT-' + l* -1 

In*A- ^ 1 


(12.4) 
• >1, (12.5) 


J+ ' ( * +,)A "Iw 
0 if+tr ?+<*+ 2 ><*+ 3 > A 


0 

« 

0 


0 

0 

0 


(sfi) ?+«»+'* 





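If desired, $L_0$ can be expressed as the continued fraction

$$L_0 = \cfrac{p}{p + \cfrac{\frac{1^{2}}{1\cdot 3}\rho^{2}}{p + 1\cdot 2\lambda + \cfrac{\frac{2^{2}}{3\cdot 5}\rho^{2}}{p + 2\cdot 3\lambda + \cfrac{\frac{3^{2}}{5\cdot 7}\rho^{2}}{p + 3\cdot 4\lambda + \ldots}}}}$$

of which (12.4) is the $\nu$th convergent.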
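The first approximation ($\nu = 1$) is

$$L_0 = \frac{p}{p + \rho^{2}/6\lambda} + O(T^{-1}), \qquad L_1 = \frac{i\rho}{2\lambda}\cdot\frac{p}{p + \rho^{2}/6\lambda} + O(T^{-3/2}),$$

$$L_s = O(T^{-\frac{1}{2}s}), \qquad s \geq 2,$$

which transforms back to

$$\Phi = \left\{1 + \frac{i\rho}{2\lambda}\cos\psi\right\}e^{-\rho^{2}t/6\lambda} + O(T^{-1})$$

or, to the same order,

$$\Phi = \exp\left\{\frac{i\zeta}{2\lambda} - \frac{t}{6\lambda}(\xi^{2} + \eta^{2} + \zeta^{2})\right\} + O(T^{-1}).$$

The limiting distribution is of spherical normal form centred at
$(0, 0, 1/2\lambda)$ with variance $t/3\lambda$ in each of the three co-ordinates, the
radial distribution being unaffected by the eccentricity to the same
order.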
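As an illustration of how quickly the continued fraction for L0 settles down, the sketch below evaluates it from the bottom up and compares the result with the first approximation p/(p + ρ²/6λ). It assumes partial numerators s²ρ²/{(2s − 1)(2s + 1)} and partial denominators p + s(s + 1)λ; the function names and the default depth are hypothetical choices made for this example.

```python
def L0_continued_fraction(p, rho, lam, depth=40):
    """Evaluate the continued fraction for L0, truncated after `depth` levels."""
    tail = 0.0
    for s in range(depth, 0, -1):
        numerator = rho**2 * s * s / ((2 * s - 1) * (2 * s + 1))
        tail = numerator / (p + s * (s + 1) * lam + tail)
    return p / (p + tail)


def L0_first_approximation(p, rho, lam):
    """Leading term for small p and rho^2 of the same order."""
    return p / (p + rho**2 / (6.0 * lam))
```

For example, with p = 10⁻⁴, ρ = 0.02 and λ = 1 the two functions should agree to about four decimal places, and deepening the fraction beyond the first few levels changes the value only negligibly.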

The second approximation ($\nu = 2$) is

= | 1 + 12 — loio 1 \ e ~* + 0(T~ % ) 

stt* Jr *} e * + 0(r-l) 

<P 1 -_ T i s ^e-*r‘ + 0(T-»), 

and using (11.4) in conjunction with (7.6), we find 

F (r, 0) r» sin ddrdd = ^ * i + 2 £ ” ** **] 


+ [t v- M i ■+ W £-$§£] *<« *> 


+ i| J P l (oosd) + 0(T- , )J. 


( 12 . 6 ) 


P.R.S.E., VOL. LXIII, A, 1950-51, PART III





The behaviour of the chain for small T can be examined as in §8,
using the analogue of (8.1) in cylindrical polar co-ordinates. It is found
that, ignoring $O(T)$, z has the constant value t, while (x, y) has a circular
normal distribution about zero with variance $\tfrac{2}{3}\lambda t^{3}$ in x or y.

13. The Discrete Chain of Equal Steps.—We return to the discrete
chain and discuss the important special case where the steps are all of
equal size, the modification necessary for variable steps being obvious.
The assumption of axial symmetry is retained.

It is convenient to assume the steps to be of unit length, the restriction
being removed at the end by writing r/a for r. From (10.5), the generating
function

$$G(\rho, \psi;\, Z) = \sum_{n=0}^{\infty}\Phi(\rho, \psi;\, n)\,Z^{n}$$

is seen to satisfy the equation

$$G(\rho, \psi;\, Z) = 1 + Z e^{i\rho\cos\psi}\int_0^{\pi} d\alpha \int_0^{2\pi}\frac{d\omega}{2\pi}\,g(\alpha)\,G(\rho, \psi';\, Z), \qquad (13.1)$$

where

$$\cos\psi' = \cos\psi\cos\alpha + \sin\psi\sin\alpha\cos\omega$$

and $g(\alpha)$ is written for $g(a, \alpha)$. Expand G in the form

$$G(\rho, \psi;\, Z) = \sum_{s} G_s(\rho;\, Z)\,P_s(\cos\psi).$$

The polynomials $P_s(\cos\psi')$ occurring in the integrand may be effectively
replaced by $P_s(\cos\psi)P_s(\cos\alpha)$ using (11.3), and (13.1) becomes

$$\sum_{s} G_s P_s(\cos\psi) = 1 + Z e^{i\rho\cos\psi}\sum_{s} g_s G_s P_s(\cos\psi), \qquad (13.2)$$

where

$$g_s = \int_0^{\pi} g(\alpha)\,P_s(\cos\alpha)\,d\alpha, \qquad (13.3)$$

so that

$$g(\alpha) = \sin\alpha\sum_{s=0}^{\infty}(s + \tfrac{1}{2})\,g_s P_s(\cos\alpha). \qquad (13.4)$$

It follows that

$$G_m = \delta_{m0} + (m + \tfrac{1}{2})\,Z\sum_{s} g_s G_s c_{ms}, \qquad (13.5)$$

where $\delta_{m0} = 1, 0$ for $m = 0$, $m > 0$, and

$$c_{ms} = \int_{-1}^{1} e^{i\rho u}\,P_m(u)\,P_s(u)\,du. \qquad (13.6)$$





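In particular,

$$c_{00} = \frac{2\sin\rho}{\rho}.$$

The c's satisfy the recurrence formula

$$c_{m,s+1} = -i\,\frac{(2s + 1)}{(s + 1)}\frac{\partial}{\partial\rho}c_{ms} - \frac{s}{(s + 1)}\,c_{m,s-1},$$

from which power series in $\rho$ can be constructed. The constant term
in $c_{ss}$ is obviously $2/(2s + 1)$. As far as $s = 2$, $m = 2$, we find

$$c_{00} = 2 - \tfrac{1}{3}\rho^{2} + \tfrac{1}{60}\rho^{4} - \ldots$$
$$c_{01} = c_{10} = \tfrac{2}{3}i\rho - \tfrac{1}{15}i\rho^{3} + \tfrac{1}{420}i\rho^{5} - \ldots$$
$$c_{11} = \tfrac{2}{3} - \tfrac{1}{5}\rho^{2} + \tfrac{1}{84}\rho^{4} - \ldots$$
$$c_{02} = c_{20} = -\tfrac{2}{15}\rho^{2} + \tfrac{1}{105}\rho^{4} - \ldots$$
$$c_{12} = c_{21} = \tfrac{4}{15}i\rho - \tfrac{4}{105}i\rho^{3} + \tfrac{1}{630}i\rho^{5} - \ldots$$
$$c_{22} = \tfrac{2}{5} - \tfrac{11}{105}\rho^{2} + \tfrac{1}{140}\rho^{4} - \ldots \qquad (13.8)$$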
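The power series for the c's can be generated mechanically, which gives a convenient check on the coefficients. The sketch below is an illustration rather than the author's procedure: instead of the recurrence it expands the exponential and integrates term by term, so that c(m, s) is the sum over k of q_k (iρ)^k with q_k = (1/k!) ∫ u^k P_m(u) P_s(u) du over (−1, 1). Exact rational arithmetic is used, and all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def legendre(n):
    """Coefficients of P_n(u), lowest degree first, by Bonnet's recurrence."""
    previous, current = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return previous
    for k in range(1, n):
        # (k + 1) P_{k+1} = (2k + 1) u P_k - k P_{k-1}
        nxt = [Fraction(2 * k + 1, k + 1) * c for c in [Fraction(0)] + current]
        for i, c in enumerate(previous):
            nxt[i] -= Fraction(k, k + 1) * c
        previous, current = current, nxt
    return current


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def c_series(m, s, terms=6):
    """Return [q_0, ..., q_{terms-1}] with c_ms = sum_k q_k * (i*rho)**k."""
    product = poly_mul(legendre(m), legendre(s))
    series = []
    for k in range(terms):
        total = Fraction(0)
        for j, coeff in enumerate(product):
            power = j + k
            if power % 2 == 0:  # odd powers integrate to zero over (-1, 1)
                total += coeff * Fraction(2, power + 1)
        series.append(total / factorial(k))
    return series
```

Since (iρ)² = −ρ², an entry q_2 = 11/105 in `c_series(2, 2)` corresponds to the term −(11/105)ρ² in the series for c22.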


14. Consider first the solution of equations (13.5) for the classical
random walk where each step has a random direction, that is,
$g(\alpha) = \tfrac{1}{2}\sin\alpha$, $g_0 = 1$, $g_s = 0$ ($s \geq 1$). In this case

$$G_m = \delta_{m0} + (m + \tfrac{1}{2})\,Z G_0 c_{m0},$$

so that

$$G_0 = (1 - \tfrac{1}{2}Zc_{00})^{-1} = (1 - Z\sin\rho/\rho)^{-1}$$

and

$$G = 1 + ZG_0\sum_{m=0}^{\infty}(m + \tfrac{1}{2})\,c_{m0}P_m(\cos\psi) = 1 + \frac{Ze^{i\rho\cos\psi}}{1 - Z\dfrac{\sin\rho}{\rho}}.$$

The coefficient of $Z^{n}$ is

$$\Phi(\rho, \psi;\, n) = e^{i\rho\cos\psi}\left(\frac{\sin\rho}{\rho}\right)^{n-1},$$

which is not independent of $\psi$ but contains the factor $e^{i\rho\cos\psi}$, due to
the fact that one step of the chain must always lie along the z axis. The
same result can, of course, be got by simpler methods.

The distribution in the general case is again difficult to obtain in
closed form for all n from (13.5), though exact expressions for the
moments of $r^{2}$ are derivable (see the remark at the end of §8). Approximations
can, however, be found as before for long chains. For this
purpose it is best to treat

$$L = pG(\rho, \psi;\, e^{-p}) \qquad (14.1)$$

as the Laplace transform* of $\Phi$ with respect to n, and to seek an asymptotic
expansion for large n by the methods of the preceding section, with
the following justification. Since

$$\Phi(\rho, \psi;\, n) = \frac{1}{2\pi i}\oint_{C} G(\rho, \psi;\, Z)\,\frac{dZ}{Z^{n+1}},$$

the contour C enclosing the origin but no singularity of G, we also
have

$$\Phi(\rho, \psi;\, n) = \frac{1}{2\pi i}\int_{C'} G(\rho, \psi;\, e^{-p})\,e^{pn}\,dp,$$

where C' is an open contour in the p plane joining points on $\mathscr{I}(p) = \pm\pi$
and having on its left all the singularities of G in the strip. By a suitable
choice of C, C' can be made to pass to the right of the origin but to end
in the second and third quadrants. On making the change of scale
$p' = pN$, where N is a typical (large) value of n, the new contour in the
p' plane approximates to the modified Bromwich contour; the effect
of replacing it by the latter is to introduce terms which are exponentially
small and so do not enter into the asymptotic expansions.

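The equations (13.5) now take the form, analogous to (12.3),

$$-2pe^{p} = L_0(c_{00} - 2e^{p}) + L_1 g_1 c_{01} + L_2 g_2 c_{02} + \ldots$$
$$0 = L_0 c_{10} + L_1(g_1 c_{11} - \tfrac{2}{3}e^{p}) + L_2 g_2 c_{12} + \ldots$$
$$0 = L_0 c_{20} + L_1 g_1 c_{21} + L_2(g_2 c_{22} - \tfrac{2}{5}e^{p}) + \ldots \qquad (14.2)$$

and the formal solution is

$$L_s = -2pe^{p}\lim_{\nu\to\infty}\frac{D_{0s,\nu}}{D_\nu}, \qquad (14.3)$$

where, if $l_{ms} = g_s c_{ms} - \delta_{ms}\,e^{p}/(m + \tfrac{1}{2})$,

$$D_\nu = \begin{vmatrix} l_{00} & \cdots & l_{0\nu} \\ \vdots & & \vdots \\ l_{\nu 0} & \cdots & l_{\nu\nu} \end{vmatrix}$$

and $D_{0s,\nu}$ is the cofactor of $l_{0s}$ in $D_\nu$.

* The form (14.1) arises immediately if the Laplace-Stieltjes transform is
used.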





The existence of the limiting form (14.3), at least for large n, and
the order of the approximation when the determinants are replaced by
$D_{0s,\nu}$ and $D_\nu$ can be established in the following way. It will be sufficient
to consider $L_0$:

For large n we again seek approximation for small p and $\rho$ with
$p = O(\rho^{2})$, and if N is a typical value of n, p and $\rho^{2}$ may be taken as
$O(N^{-1})$. Consider the order of magnitude of $l_{ms}$. From (13.6) it appears
that $c_{ms} = O(N^{-\frac{1}{2}|m-s|})$. If $g(\alpha)$ is not degenerate, $g_0 = 1$, $|g_s| < 1$
($s \geq 1$); hence $l_{ms} = O(N^{-\frac{1}{2}|m-s|})$ provided m and s are not both zero,
but $l_{00}$ is anomalous since

$$l_{00} = c_{00} - 2e^{p} = -2p - \tfrac{1}{3}\rho^{2} + \ldots = O(N^{-1}).$$

Also $l_{ms}$ is bounded for all m and s, and $l_{ss} = O(1)$ for $s \geq 1$. Hence on
dividing the appropriate columns of $D_{00,\nu}$ and $D_\nu$ by $l_{11}$, $l_{22}$, $l_{33}$, ...
convergent determinants are obtained and the existence of the limit
(14.3) follows for large N. Moreover, from Schweins's theorem on the
ratio of two determinants [Aitken (4)],


A»,r + 1 _ Aw.y I n D\ + Lr + 1 

I) D ‘ w " +1 D D 

U * + \ l/ w U v U w + \ 


(14.4) 


By extracting factors from rows and columns so as to reduce diagonal
elements to terms which are $O(1)$, it can be shown that the second term
in (14.4) is $O(N^{-\nu+1})$. It follows that

$$L_0 = -2pe^{p}\,\frac{D_{00,\nu}}{D_\nu} + O(N^{-\nu}), \qquad (14.5)$$

and in a similar way we find

$$L_s = -2pe^{p}\,\frac{D_{0s,\nu}}{D_\nu} + O(N^{-\nu+\frac{1}{2}s-1}). \qquad (14.6)$$

Successive terms of the asymptotic expansion for large N may then be
calculated as before.

15. The first approximation is

$$L_0 = \frac{p}{p + \dfrac{(1 + g_1)}{6(1 - g_1)}\rho^{2}} + O(N^{-1}),$$

$$L_1 = \frac{i\rho}{(1 - g_1)}\,L_0 + O(N^{-3/2}),$$

showing the limiting distribution to be again spherically normal, centred at
$(0, 0, a/(1 - g_1))$ with the variances of x, y, z equal to $\sigma^{2} = \tfrac{1}{3}\dfrac{(1 + g_1)}{(1 - g_1)}\,na^{2}$
(cf. Eyring (1932), Moran (1948)).
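The limiting variance can be compared with an exact mean-square end-to-end distance. The sketch below rests on an assumption not spelled out at this point of the text, namely that for independent, axially symmetric steps the mean scalar product of the unit vectors of steps i and j is g1 raised to the power |i − j|. Under that assumption the exact value is a finite sum, and it should approach 3σ² = n a² (1 + g1)/(1 − g1) with a relative difference of order 1/n. The function names are hypothetical.

```python
def mean_square_distance(n, g1, a=1.0):
    """Exact <r^2> for n steps, assuming <u_i . u_j> = g1**abs(i - j)."""
    total = float(n)  # the n diagonal terms <u_i . u_i> = 1
    for k in range(1, n):
        total += 2.0 * (n - k) * g1**k  # pairs of steps k apart
    return a * a * total


def limiting_mean_square(n, g1, a=1.0):
    """Three times the limiting variance sigma^2 of each co-ordinate."""
    sigma_squared = (1.0 + g1) / (1.0 - g1) * n * a * a / 3.0
    return 3.0 * sigma_squared
```

For a freely linked chain (g1 = 0) both functions return n a², and for g1 = 1/3 with n = 1000 the ratio of the two differs from unity by less than one part in a thousand.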





The calculation of higher order approximations entails much tedious 
algebra, but the formulae are simplified by the change of scale 

The next approximation to $\Phi_s$ is then found to be

* _ I,. ». p*. r , t»+Vk+*fc*). . p*i 

4,, -t 1+ ri-0,')« + l 0 " d-0,■) +i »(i-J«i 

*_/ * u. r„(i+30.+».*) ,d+0ji<p* 

*'-J iKl^flr + l‘ (i-0,'•) — 5 (i—0,)J » 

_L T i O + ^j + Ssr,*) , (1 +0i)l *P*I _»/*• , o(jj-t) 

+ L in (i- 0 ,*) +,o (i- 0 ,)J »i* + ( ’ 

„ 1 ,P«e-l'"+0(JV-») 

n(l-ffi) 

and the corresponding distribution to this order is 

~ B*fAnddRdOe-t# 

J2tr 

if 


!/,_ 9 (i+i/!*), 3 (i+j/,), r(3+ 2g >+wi ii+g.n/p 
■|\ ^(l-O^antl-g,) r L 2(1— g x % ) (l-l/ t )J » 

'l”' d-0,*) +I ”<r-0.)J n) 

+ P tom 0) / 3 P + f_(' 3 + < 0i+ 13 0.*)..(l+0.)l* 

+ P,< °“W( 1 -0,*)U + t 4(1-0,*) + *(1-0,)J»* 

[(17 + 10^ + 17jT|*) _ „ (1 + g t )] B* 

L 10 ( 1 -^*) *( 1 - 0 ,)J» S 

,[ i (3 + 4* + 3*«) x (1 +*)1 JP\ 

+ r a ”' (i -0,*> +I "<r-0.)M 

+ Pj(COS d) . 


ti J TV? + 0{ ^" ,, !‘ 

(1 — 9t) » ) 


(15.1) 


where $R^{2}$ is to be replaced by $\dfrac{3(1 - g_1)\,r^{2}}{2n(1 + g_1)\,a^{2}}$.

It has been checked that for the limiting continuous chain where
$t = na$, $E(\alpha^{2}) \approx 4\lambda a$, $g_1 \approx 1 - 2\lambda a$, $g_2 \approx 1 - 6\lambda a$, (15.1) agrees with
(12.6). The terms in $P_1$ and $P_2$ do not enter into the radial distribution;
setting $g_1 = g_2 = 0$ for a freely linked chain, we find for the distribution
of R,

$$\frac{4}{\sqrt{\pi}}\,R^{2}\,dR\,e^{-R^{2}}\left\{1 - \frac{3}{4n} + \frac{R^{2}}{n} - \frac{R^{4}}{5n} + O(N^{-2})\right\}$$

with $R^{2} = \dfrac{3r^{2}}{2na^{2}}$, in agreement with Rayleigh's result to this order.

For a chain whose links have complete conical freedom, $\alpha$ has a
fixed value and $g_s = P_s(\cos\alpha)$. The particular value $\cos\alpha = \tfrac{1}{3}$ ($g_1 = \tfrac{1}{3}$,
$g_2 = -\tfrac{1}{3}$) has a physical application. The radial distribution is


J 


2 IPdRe-l*' \ \ - I*** ■ 

7T 10n » 


Ri ) 

l? K + 0 (N-*) 
n I 




3r* 

2na*’ 


which should represent the distribution of end separation of long paraffin
molecules in random motion under conditions where conical freedom
may be assumed (Treloar, 1949, p. 40). Approximations of this type
are, of course, not valid for calculations of entropy at high values of R.


REFERENCES TO LITERATURE

Aitken, A. C., 1948. Determinants and Matrices, Oliver & Boyd.

Bartlett, M. S., 1949. "Some Evolutionary Stochastic Processes", Journ. Roy. Stat. Soc., B, xi, 211-229.

Eyring, H., 1932. "The Resultant Electric Moment of Complex Molecules", Phys. Rev., xxxix, 745-748.

Goudsmit, S., and Saunderson, J. L., 1940. "Multiple Scattering of Electrons", ibid., lvii, 24-29.

Kac, M., 1949. "On Distributions of Certain Wiener Functionals", Trans. American Math. Soc., lxv, 1-13.

McLachlan, N. W., 1947. Theory and Application of Mathieu Functions, Oxford.

Moran, P. A. P., 1948. "The Statistical Distribution of the Length of a Rubber Molecule", Proc. Camb. Phil. Soc., xliv, 342-344.

Moyal, J. E., 1950. "The Momentum and Sign of Fast Cosmic Ray Particles", Phil. Mag., xli, 1058-1077.

Perrin, F., 1928. "Étude Mathématique du Mouvement Brownien de Rotation", Ann. Sci. Éc. Norm. Sup., xlv, 1-51.

Rayleigh, Lord, 1919. "On the Problem of Random Vibrations and of Random Flights in One, Two and Three Dimensions", Phil. Mag., xxxvii, 321-347.

Rossi, B., and Greisen, K., 1941. "Cosmic Ray Theory", Rev. Mod. Phys., xiii, 240-309.

Treloar, L. R. G., 1949. The Physics of Rubber Elasticity, Oxford.


(Issued separately May 30, 1952) 






XXII.— Artificial Holograms and Astigmatism.* By G. L. 
Rogers, M.A., Ph.D., Department of Physics, University 
College, Dundee. Communicated by Professor G. D. PRESTON. 
(With One Plate and Four Text-figures.) 

(MS. received August 14, 1951. Revised MS. received November 22, 1951. Addendum 
received December 14, 1951. Read December 3, 1951) 


Synopsis 

Experiments in diffraction microscopy, previously described, are here continued.
Special emphasis is now laid on verifying the theory by the production of an "artificial"
hologram, by non-diffractive means, from data calculated for a relatively simple object.
The assumed object is then reconstructed in the usual apparatus.

A type II linear zone plate of limited width is studied as a particular case of an artificial
hologram. It gives rise to an unexpected black artefact, which is explained by a detailed
analysis of this particular zone plate, and is shown to be due to its limited extent.

Experiments on twisting the linear zone plate skew to the reconstructing beam show
that the effective focal length is affected astigmatically by a factor proportional to $\cos^{2}\theta$,
where $\theta$ is the angle of twist, for lines parallel to the axis of twist. Lines perpendicular
to the axis of twist are unaffected.

The production of a hologram in an astigmatic pencil and its subsequent reconstruction
while skew to a parallel beam is described. It is found that the focal length
differences can be corrected in this way, but that the lateral scale factors are only partially
rectified.


1. Introduction 

In an earlier paper (Rogers, 1951) a number of experiments were described 
in diffraction microscopy in which holograms were prepared by diffraction 
from a variety of objects, and reconstructed images subsequently obtained 
under different conditions. It was felt that the theory of the method would 
be strengthened if it were possible to calculate the shape and intensity of 
a simple hologram and construct it by quite other means than diffraction, 
subsequently obtaining an image of the assumed object in the normal 
way. The first part of the paper deals with attempts of this kind. The 
second describes some observations on astigmatic effects in diffraction 
microscopy which followed at once from the first experiments. 

2. Black-and-White Artificial Holograms 

In a sense, an ordinary zone plate may be regarded as the calculated 
hologram of a scattering point, with a rough approximation to the intensity 

* This paper was assisted in publication by a grant from the Carnegie Trust for the
Universities of Scotland.



314 


G. L . Rogers 


variation in the abrupt black to white transitions of the zone borders. The
results of the work with the zone plate are so well known that it is not
proposed to discuss it further here.

The next simplest object is a scattering line, and the linear zone plates 
considered in I are the appropriate black-and-white artificial holograms of 
objects of this type (with varying phase-shifts in the scattering line). A 
fairly detailed theory has already been given, but the experimental work 
was incomplete at the time of going to press, and is here briefly discussed. 

A finite type II zone plate was produced, that is, one based on the
series

$$v^{2} = 1\tfrac{1}{2},\ 3\tfrac{1}{2},\ 5\tfrac{1}{2},\ \ldots,$$

where v is the distance along Cornu's spiral corresponding to the zonal
edges as seen from the focal position. It has a central white zone, with
eight black zones and seven white zones on each side, and an extensive
exterior region of white, corresponding to a vector from the last v point
($v^{2} = 31\tfrac{1}{2}$) to the convergence point ($\tfrac{1}{2}$, $\tfrac{1}{2}$) of the Cornu spiral. We assume
for this discussion that the Cornu spiral has a convergence point, though
C. L. Andrews (1951) has challenged this assumption. It is possible that
a careful study of the theory of finite linear zone plates might decide the
issue.

The linear zone plate behaved entirely as predicted in the first order,
giving the usual wave-length variation. The theoretical focal length is
obtained from the dimensions of the zone plate quite easily. The distance,
x, from the centre line to any zone boundary is related to the v of the
Cornu spiral by the relationship:

$$x = v\sqrt{\tfrac{1}{2}f\lambda},$$

where f is the focal length of the zone plate to wave-length $\lambda$.

Transforming this, we get

$$f = \frac{2}{\lambda}\left(\frac{x}{v}\right)^{2}.$$

A set of values for x for the edges of the zones was obtained by travelling
microscope, and these were squared and divided by the appropriate
values of $v^{2}$, viz. $1\tfrac{1}{2}$, $3\tfrac{1}{2}$, $5\tfrac{1}{2}$, etc. The value of $(x/v)^{2}$ thus obtained was
reasonably constant, there being a slight indication that the dark zones were
a little larger than they should be, possibly due to lateral spread of the light
during printing. There was also a suggestion that the central zone was a
few per cent. too narrow, of which more later. The average value was
(3·11 ± 0·05) × 10⁻³ sq. cm., giving f for the mercury green equal to 114 cm.
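The arithmetic behind the quoted focal length can be reproduced in a few lines. This is only a sketch: it assumes the relation f = 2(x/v)²/λ and takes the mercury green wave-length as 5·461 × 10⁻⁵ cm, a value supplied here for illustration and not quoted in the text; the names are hypothetical.

```python
# Assumed wave-length of the mercury green line, in centimetres.
MERCURY_GREEN_CM = 5.461e-5


def focal_length_cm(x_over_v_squared_cm2, wavelength_cm=MERCURY_GREEN_CM):
    """Focal length from the mean value of (x/v)^2, assuming x = v*sqrt(f*lambda/2)."""
    return 2.0 * x_over_v_squared_cm2 / wavelength_cm


def focal_length_uncertainty_cm(delta_cm2, wavelength_cm=MERCURY_GREEN_CM):
    """Uncertainty in f corresponding to an uncertainty in (x/v)^2."""
    return 2.0 * delta_cm2 / wavelength_cm
```

With (x/v)² = 3·11 × 10⁻³ sq. cm. this gives a focal length close to 114 cm, and the quoted uncertainty of 0·05 × 10⁻³ sq. cm. corresponds to a little under 2 cm in f.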




The measured value was 115 cm., in better agreement than we have a 
right to expect. 

Because of the finite size of the zone plate high-order effects are not to 
be expected, as we shall see. But an important anomaly was observed. 
A black line was formed in the centre of the field when the eyepiece was near 
to the third-order position. It was, however, clear that the position of this 
anomaly was not exactly at the third-order position, but was a little 
outside it. That is, it corresponds to an order slightly less than three. 


3. Theory of the Anomalous Dark Line 

It was therefore decided to explore this phenomenon in theory. In 
order to do so, we develop further the theory of the Cornu spiral given 
in I. The analysis is ultimately graphical, the central vector being plotted 
on the Cornu spiral, and those due to the outer zones being calculated in 
magnitude and direction, and being drawn in to form the vector polygon. 
A sample polygon is given in fig. 1. 

It is shown in I that the slope and radius of curvature of the Cornu
spiral are readily obtained from its intrinsic equations $s = v$, $\psi = \tfrac{1}{2}\pi v^{2}$, giving
$\rho = 1/\pi v$. We can therefore transform to polar co-ordinates centred on
($\tfrac{1}{2}$, $\tfrac{1}{2}$) for all portions of the spiral over $v = 2$, as the point ($\tfrac{1}{2}$, $\tfrac{1}{2}$) is then
always close to the centre of curvature of the spiral. The lengths of
vectors in the outer zones can readily be calculated in terms of these polar
co-ordinates to an accuracy of a few per cent. The initial line of the polar
co-ordinates is conveniently taken parallel to the x or C(v) direction.

Now a high-order effect is obtained by approaching the zone plate more
closely than its normal focal distance, whereby the fixed markings subtend
a larger angle at the new point of observation, and thus correspond to
positions further from the centre of the Cornu spiral, as drawn from this
new point of observation. If the new point of observation is at a distance
f/n from the zone plate, f being its primary focal length, the zone boundaries
will now occur at values of v on the new Cornu spiral given by

$$v^{2} = n \times (1\tfrac{1}{2},\ 3\tfrac{1}{2},\ 5\tfrac{1}{2},\ \ldots).$$

If n is an odd integer we may expect unusually great effects in some cases,
corresponding to high-order foci on an ordinary zone plate. But for the
purpose of this analysis we shall assume that n can have any value.

In our particular case, the contribution from the inner zone is obtained
graphically, and that from $v^{2} = n \times 1\tfrac{1}{2}$ to $v^{2} = n \times 3\tfrac{1}{2}$ is blacked out. The
first outer zone is thus that from $v^{2} = n \times 3\tfrac{1}{2}$ to $v^{2} = n \times 5\tfrac{1}{2}$. The vector to
which this can give rise can be obtained at once by solving the appropriate
triangle in polar co-ordinates. This procedure is repeated for each outer
vector, until the vector from $v^{2} = n \times 31\tfrac{1}{2}$ to the point ($\tfrac{1}{2}$, $\tfrac{1}{2}$) is reached.

This vector is obtained from the polar equation directly.



Fig. 1. Cornu's spiral.


In order to search for an anomaly in the region just below n = 3,
calculations have been made for values of n of 2·90, 2·91, 2·92, 2·93, 2·94 and
2·95. It was found that the lengths of the outer vectors, including the
final vector to ($\tfrac{1}{2}$, $\tfrac{1}{2}$), did not change at all rapidly in this region, and could
be represented to better than ½ per cent. by the values for n = 2·925. The
main change occurs in the angular arrangement of these vectors, which is
regular one to another. The external angles of the several polygons vary
uniformly from 36° for n = 2·90 to 18° for n = 2·95 (and is, of course, zero
for n = 3·0). The final vector, E, representing the open field from
$v^{2} = n \times 31\tfrac{1}{2}$ to infinity, can be shown to subtend an external angle at the
end of the polygon which is exactly J of that ruling up to that point. 
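The quoted external angles follow from a short calculation, sketched below under stated assumptions: the direction of the Cornu spiral at v is taken as ½πv², and successive open zones of the type II plate are taken to be 4n apart in v², so that the direction advances by 2πn from one zone vector to the next. Near n = 3 this is 6π less a deficit of 2π(3 − n), which is the external angle of the polygon. The function name is hypothetical.

```python
def external_angle_degrees(n):
    """External angle between successive outer-zone vectors for an order n near 3."""
    # Direction advance per zone is 360*n degrees; three whole turns are removed.
    return 360.0 * (3.0 - n)


# Tabulate the six cases considered in the text.
orders = [2.90, 2.91, 2.92, 2.93, 2.94, 2.95]
angles = [external_angle_degrees(n) for n in orders]
```

The list `angles` runs uniformly from about 36 degrees down to about 18 degrees, and the function returns zero at n = 3·0, where the outer vectors fall into line.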

The polygons corresponding to these six cases were plotted from the
data thus obtained, and the ends of the polygons carefully noted. Fig. 1
shows one such polygon for n = 2·93. The central zone vector is marked
0, and is drawn from the origin to the point where $v^{2} = n \times 1\tfrac{1}{2}$. The outer
vectors have been drawn in and suitably numbered. The final point, at
the end of the vector E, is numbered 3. The end points of the other five
polygons have been transferred to this diagram and marked 0, 1, 2, 4 and
5 respectively. It will be seen that the end points sweep out a smooth
curve with increasing n, and that this curve passes close to the origin at
n = 2·93.

The darkness thus produced does not look very marked on the Cornu 
spiral, but this is an amplitude diagram, and the resultants must be squared 
to give the intensity. Moreover, this calculation is for the ideal zone plate. 
If the actual zone plate has too narrow a central open space, this will 
contract the vector 0 a little, and thus move the polygon bodily to the left, 
still further reducing the gap with the origin. It is thought that the depth 
of the black line produced by this zone plate owes something to the slightly 
contracted core. 

If the total length of the vectors is summed without regard to phase, it is
found to be 0·75 v units. This is by no means sufficiently large to swamp
the initial 0 vector, as is required by the theory of paper I, even if the
vectors are dead in line, as they will be at order 3·0. We do not, therefore,
expect to get any very marked enhancement of the intensity at n = 3·0, and
for higher orders the vector sum of the outer zones of this particular zone
plate will be still less, and will therefore produce less and less effect on the
initial vector 0. This explains the observed absence of high-order effects.


4. Finite Linear Zone Plates 

It is of interest to consider here the effects likely to be produced by a 
finite linear zone plate, with a view to estimating the number of orders 
likely to be obtained from a given number of zones. It is convenient to 
take a type I zone plate, with zone edges given by the v values where 

$$v^{2} = 1,\ 3,\ 5,\ 7,\ \ldots \text{ etc.}$$


It was shown in I that in this case the sum of the outer zones (leaving 
aside the central one) is 


s -“’%{v; + ^ + vi + vi + •' "“ c } 


A lower limit to this sum is 

obtained by grouping $2^{k}$ terms of the first series into k sub-groups of
geometrically decreasing value (approximately).

This sum refers to the first order, but it is shown in I that its value in
the nth order will be $1/\sqrt{n}$ of the above. This nth order vector is to be
comparable with the central zone vector, whose length is roughly ijVl. 

On this basis, we find that a zone plate with 64 outer zones on each 
side would give orders up to » = 34, while one of 8 outer zones each side 
goes up to » = 3‘4. The latter result is in reasonable agreement with the 
last section. 


5. More Complex Objects 

So far the objects treated have been very simple. The next object 
tried was a finite wire or slat obstruction, calculations for which were 
available in routine optical lecture notes. The calculations were not very 
suitable: the hologram still closely resembled the original object, and the 
number of external "fringes" available was only three. The reconstructions 
obtained are not regarded as very satisfactory. 

The problem of producing a continuous tone hologram was next 
studied. A set of calculations from the Fresnel Integrals was kindly done 
by Mr J. W. B. Laing for an object consisting of two wires or slats, of 
equal breadth, separated by a distance slightly (20 per cent.) wider than 
their breadths. The calculation was in a plane such that the double 
structure of the object was entirely obscured by diffraction, the pattern 
looking like that of a single obstruction to the untutored eye. 

The intensities thus obtained were plotted on a graph of I against the 
parameter v from the spiral. The plot was repeated below the v-axis to 
form a double symmetrical plot, and the outer regions were blacked out to 
form a pattern similar to a Phillips-Miller type variable area sound track 
(Wood, 1940). The conversion of this to a variable density record is 
readily achieved by a well-known dodge. The plot A (fig. 2) is set up with 
its I-axis vertical, and is uniformly illuminated. An image is formed on 



Artificial Holograms and Astigmatism 319 

the plate BD by a cylindrical lens, C, with its cylindrical axis, LL′, vertical, 
and its optic axis OO′ (if one may call it such) horizontal and perpendicular 
to BD and A. The line BD is horizontal through O′, and the centre of 
symmetry of A is at O. The intensity of light falling on any point P of 
BD is proportional to the height of the white element of area QR, which 



Fig. 2 

corresponds to it optically in the cylindrical axis LL' of the cylindrical lens 
C (which axis is equivalent to the centre of a thin spherical lens: rays go 
through it without deviation). This assumes that the lens is sufficiently 
high to take in the extreme rays PQ, PR. In practice, it was arranged that 
the lens was ample, with a fair margin, so that a moderate strip of about 
J in., centred on BD, was all essentially of the right density for the variable 
density record. When printed with a contrast of 2, this is of course 
equivalent to a hologram. 

A cylindrical lens was taken from an optician's test set and stopped down 
laterally with a slit J in. wide (retaining its full height of 1 in.). It was 





then mounted on the copying camera, in place of the usual spherical lens, 
with the cylinder axis vertical, and the optic axis horizontal. The plot was 
set up as above, and the focusing was done in blue light with a filter. The 
exposure was then made on an ordinary plate, sensitive to the blue only. 
This reduces chromatic aberration, and fits in with our standard processing 
technique. A step wedge is subsequently imposed on one edge, to control 
the contrast. 

In order to ensure that only the central portion was used, prints were 
made on paper, and the unwanted portions removed with a guillotine. 
Four such strips were pasted together in order to give the necessary height. 
It was then further reduced in size photographically to provide the hologram 
finally used. 

Fiducial marks were carried through the operation to indicate the 
length equivalent to 22·0 v units, and a measurement of these on the final 
hologram gives us the focal length, as with the linear zone plate. The 
theoretical focal length in mercury green is 72·7 cm. The hologram 
reconstructs the theoretical double slat pattern very well, at a focal length 
to mercury green of 69 ± 6 cm. The agreement is quite reasonable. The 
Plate gives an outline of the steps in the process, and the final result. 


6. Astigmatism 

During this work we became aware of the bearing of this type of 
linear pattern on the problem of astigmatism. If a linear zone plate be 
arranged with its length vertical and its plane perpendicular to a horizontal 
beam of parallel light, it produces the normal bright line image in its focal 
plane. If, now, the hologram or zone plate be twisted about a vertical 
axis through an angle θ, so that its plane is no longer perpendicular to the 
horizontal beam, it acts as a cylindrical lens of higher power. It will 
produce a line focus at a distance f cos² θ, where f is the normal focal 
length. If, however, the zone plate is originally horizontal, and not 
vertical, its focal length is unaffected by twisting about a vertical axis. 
Hence we may expect that with a generalized pattern, twisting the hologram 
will introduce an element of astigmatism of amount proportional to cos² θ. 
Contrariwise, twisting a hologram with astigmatism present, such as may 
easily arise in the electron case, might conceivably be used to remove 
it. 

The cos² θ law is to be expected on theoretical grounds. The twisted 
zone plate may be regarded as equivalent to a finer zone plate of shorter 
focus, obtained by projecting it on to the plane normal to the light beam. 




The change of linear scale produced by this projection is L = cos θ. It is 
shown in I that f ∝ L², which explains the observations. 
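
The law can be checked numerically. The following is a minimal sketch in Python, assuming only the two relations stated above (projected scale L = cos θ, and focal length proportional to L²); the function name and the 100 cm focal length are illustrative and are not taken from the paper.

```python
import math


def twisted_focal_length(f_normal, theta_deg):
    """Focal length of a linear zone plate twisted by theta_deg about an
    axis parallel to its zones, given its focal length at normal incidence."""
    scale = math.cos(math.radians(theta_deg))  # projected linear scale L
    return f_normal * scale ** 2               # f is proportional to L squared


if __name__ == "__main__":
    for angle in (0, 15, 30, 45, 60):
        print(angle, round(twisted_focal_length(100.0, angle), 2))
```

A twist of 60 degrees halves the projected scale and therefore quarters the focal length, which is the sense in which the twisted plate acts as a lens of higher power.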

A careful set of measurements is shown in graphical form in fig. 3. 



7. Production of a Hologram in an Astigmatic Beam 

In order to test the possibility of correcting an astigmatic hologram by 
mounting it skew to the incident light, it was necessary to produce an 
astigmatic hologram. A cylindrical lens was therefore placed in the 
filter-holder of the apparatus described in I, and this, together with the usual 
microscope objective, in this case of 1 in. focus, produced a much reduced 
image of the fine hole in front of the arc. The image was no longer a point 
image, but consisted of two focal lines, vertical and horizontal, very close 
together. The cylindrical lens was unfortunately not very powerful 
compared with the microscope objective, and hence the separation was not 
very good. 

In order to get appreciable astigmatism in the hologram, the object had 
to be placed very close to the astigmatic pencil. The position of the 
vertical line, the horizontal line and the object were determined by a 
"location run" as described in I, but unfortunately this is not a very 
good method, and the main uncertainties of the work lie here. 

As object, a very much reduced picture of three fan-like devices was used, 
two fans lying vertically and one horizontally. They were designed to aid 





precise focusing in two directions at right angles. The distance between 
the two vertical fans, a, was 1·455 times the distance, b, of the horizontal 
fan (tip) from this line. It is, of course, important to keep track of this 
ratio throughout the experiment. 

A hologram was taken in mercury blue light (4358 Å.) with the vertical 
line 8·75 cm. from the plate, the horizontal line 8·62 cm. and the object 
8·15 cm. from the plate. The powers in dioptres in blue light calculated 
from these figures are 0·90 dioptres for the vertical line and 0·66 for 
the horizontal line. Unfortunately, these powers are only obtained by 
differencing comparatively large numbers, of the order of 11 m⁻¹, and hence 
must be regarded as indicative only. Converted to mercury green they 
become 1·13 and 0·83 dioptres respectively. 
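
The conversion from blue to green can be sketched as follows. This is a minimal illustration, assuming that the power of a zone plate or hologram scales in direct proportion to the wavelength, and assuming 5461 Å for the mercury green line (the text quotes only the blue wavelength); the function name is hypothetical.

```python
BLUE_ANGSTROM = 4358.0    # mercury blue, as quoted in the text
GREEN_ANGSTROM = 5461.0   # mercury green; assumed value, not stated in the text


def convert_power(power_dioptres, from_wavelength, to_wavelength):
    """Rescale a zone-plate power from one wavelength to another,
    assuming power is proportional to wavelength."""
    return power_dioptres * to_wavelength / from_wavelength


if __name__ == "__main__":
    for blue_power in (0.90, 0.66):
        green = convert_power(blue_power, BLUE_ANGSTROM, GREEN_ANGSTROM)
        print(blue_power, "->", round(green, 2))
```

Under these assumptions the two blue-light powers come out at the green-light figures given above.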

Straight reconstructions were made in green light using the hologram 
normal to the beam, and the observed powers were found to be 1·27 and 
0·96 dioptres respectively, in better agreement than we might expect. The 
hologram was then gradually twisted, and the best position judged by eye. 
It was decided that a good focus was obtained for all three fans when the 
deflection was 29°. The two fans were horizontal, as the power for the 
two fans (originally vertical in taking) is higher than for the one fan, and 
it is the latter which must be strengthened by twisting. The angle 29° 
implies a power ratio of 1 : 0·76496 or, say, 1 : 0·765. The ratio from the 
taking data is 1 : 0·733 and from the reconstruction focal lengths 1 : 0·755. 
The agreement must be regarded as satisfactory. 
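
The three ratios can be recomputed from the figures quoted in this section. The sketch below is illustrative only; the variable names are not from the paper, and the cos² law for the twisted hologram is taken as given.

```python
import math

# Ratio implied by the 29 degree twist, using the cos-squared law.
twist_ratio = math.cos(math.radians(29.0)) ** 2
# Ratio of the blue-light powers calculated from the taking data.
taking_ratio = 0.66 / 0.90
# Ratio of the powers observed in the straight reconstructions.
reconstruction_ratio = 0.96 / 1.27

if __name__ == "__main__":
    for name, value in [("twist", twist_ratio),
                        ("taking", taking_ratio),
                        ("reconstruction", reconstruction_ratio)]:
        print(f"{name:15s} 1 : {value:.4f}")
```

All three values fall within about four per cent of one another, which is the agreement referred to above.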


8. Lateral Scale Distortion 


In any work with astigmatic beams, the possibility that the scale 
or magnification of the image might be different in the two principal 
directions is a very real one. For instance, in taking the hologram, the 
projective magnification in a horizontal plane, due to the vertical line, is 
× 13·7, whereas that vertically due to the horizontal line (which is nearer 
the object) is × 18·3, giving an exaggeration in a vertical direction. Thus 
the original vertical/horizontal ratio of 1·455 is expected to come out at 

$$1.455 \times \frac{18.3}{13.7} = 1.95.$$

From the hologram we get the ratio about 2 : 1, this 
being difficult to judge as the detail of the fan points is lost by diffraction. 
The straight reconstructions both confirm this ratio of 2 : 1. 

The most interesting ratio is that of the compensated reconstruction. 
This gives a ratio of 1·77 : 1, which is very nearly 2 cos 29° : 1, or 1·75 : 1. 
It is clear, therefore, that the skew projection produces a correction to the 




scale factor of the right sign, but not of the right magnitude. It cannot, 
therefore, be used for compensating for a hologram involving any serious 
astigmatism. 

On the other hand, the astigmatic effects with this hologram are very 
marked, and hence it is possible that electron holograms, where the scale 
distortion and astigmatism is in any case small, could be corrected by this 
device in a manner acceptable in practice. 
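
The scale ratios discussed in this section can be reproduced with a few lines of arithmetic. The sketch below is a rough check under the figures quoted above; the function and variable names are illustrative. With the rounded magnifications as printed, the expected hologram ratio comes out at roughly 1·94, close to the value used in the text.

```python
import math


def expected_ratio(original_ratio, mag_vertical, mag_horizontal):
    """Vertical/horizontal ratio after unequal projective magnifications."""
    return original_ratio * mag_vertical / mag_horizontal


hologram_ratio = expected_ratio(1.455, 18.3, 13.7)      # roughly 1.94
compensated_ratio = 2.0 * math.cos(math.radians(29.0))  # roughly 1.75

if __name__ == "__main__":
    print(round(hologram_ratio, 3), round(compensated_ratio, 3))
```

Since the undistorted ratio is 1·455, a twist of 29 degrees moves the scale ratio in the right direction but removes less than half of the excess, in line with the conclusion drawn above.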


REFERENCES TO LITERATURE 

Andrews, C. L., 1951. “A Correction to the Treatment of Fresnel Diffraction”, 
Amer. Journ. Phys., xix, 280-384. 

Rogers, G. L., 1951. “Experiments in Diffraction Microscopy”, Proc. Roy. Soc. 
Edin., A, lxiii, 193-221. 

Wood, Alexander, 1940. Acoustics, Blackie & Sons, p. 508, fig. 18.5. 


ADDENDUM 

From the Department of Physics, Sir John Cass College, 

Jewry Street, Aldgate, London, E.C.3 

(Received December 14, 1951) 

Since taking up a temporary appointment at the Sir John Cass College, 
the author has had the opportunity of checking the artificial hologram 
discussed in the main paper on a non-recording microphotometer. The 
result is interesting in demonstrating both the points of agreement between 
the actual and theoretical holograms, and the points where the technique 
of production has failed. 

The results are given in fig. 4. The upper curve gives the uncorrected 
galvanometer deflection as the artificial hologram was scanned: the 
divisions on the horizontal axis are millimetres. The instrument is said 
by the manufacturers to give a linear response, but this is difficult to check 
accurately. Within the limits imposed by the small vertical scale of this 
diagram, however, the departures are not likely to be serious. 

The second curve is the theoretical curve, as calculated, and gives P 
against units in v. 

Comparing these two curves, we note at once that the dark regions of 






the artificial hologram are not by any means as dense or obstructive to the 
incident light as the theoretical curve requires. The contrast of the first 
light fringes from the centre is exaggerated, that of the second (the real 
maxima of the pattern) is depressed, and the rest is comparatively well 
copied. But when the comparatively large distance of the variations from 
the zero axis is taken into account, it will be seen that the contrast is low 
compared with the theoretical. 



It is suggested by Gabor (Gabor, 1951) that the high level of this 
artificial hologram is not serious, but is equivalent to adding a uniform 
background to the reconstructed wave. 

It will also be noticed that at the ends of the regular curves of the top 
figure, certain irregularities are observed. A, A' give the level of the card 
used as the background to the mounted photographic prints. B, B' are 
small shadows cast by the edges of the photographic prints, as they lie 
above the card. The single peak, C, is due to slight glare from the surface 
of the print at one side in this region. 

The lowest curve is a microphotometer trace across the negative of the 
reconstructed image. On the negative, the two peaks in transmission 
correspond to the two opaque bars postulated in the calculations. The 
two small marks at the top of this diagram give the theoretical position of 
these bars. 





The asymmetry of this plot is not fully understood, but is connected 
with the fact that the intensity across the reconstructing beam was not 
uniform. The magnitude of the effect is, however, surprisingly large. 

Part of the diffraction pattern of the unwanted secondary object is 
discernible in the background. The units along the horizontal axis are 
once again millimetres. 


Acknowledgments 

I have to acknowledge the encouragement and assistance of Professor 
Preston at Dundee, and of Mr R. H. Humphry at the Cass College. I am 
also grateful to the Principal of the Cass College for arranging a temporary 
appointment and giving me laboratory space during a gap in my normal 
employment. 


REFERENCE 

Gabor, 1951. Private communication. 


(Issued separately August 23, 1952) 






[Plate. G. L. Rogers. Proc. Roy. Soc. Edin., Vol. LXIII.] 


XXIII.— Studies in Practical Mathematics. VII. On the Theory 
of Methods of Factorizing Polynomials by Iterated 
Division.* By A. C. Aitken, D.Sc., F.R.S., Mathematical 
Institute, University of Edinburgh. 

(MS. received November 30, 1951. Revised MS. received January 24, 1952. 

Read May 5, 1952) 


Synopsis 

The division of one polynomial by another is studied with the object of ascertaining 
the errors produced in the coefficients of successive remainders by small errors in the 
coefficients of the divisor. It is shown that the matrix which effects this transformation 
of errors is a polynomial in the rational canonical matrix for which the divisor polynomial 
is characteristic. The theory gives rise to a numerous class of iterative processes for 
finding an exact factor, such as the extant method based on the penultimate remainder, 
Bairstow’s iterative method of finding a quadratic factor, and many others. Some new 
suggestions are made for accelerating convergence. 


I. Introductory 

In an earlier paper (Aitken, 1950) the author examined the theory of a 
method first proposed and used by S. N. Lin (Lin, 1939, 1941) for approximating 
by repeated division to an exact factor of a given polynomial. 
This was called the method of the penultimate remainder (p.r.). It is 
shown below that there are many such methods, though that of the p.r. 
would seem to be the simplest; and occasion is taken to give a comprehensive 
theory for these methods. 

Note.—It calls for remark that Dr Lin almost immediately superseded the method of 
the penultimate remainder (not so named by him) in favour of a second and in general 
more rapidly convergent method (Lin, 1941, p. 241, and 1943) which stands to the former 
very much as Seidelian iteration for the solution of simultaneous linear equations stands 
to simple iteration. It is this later method that now bears the name of Lin’s method. The 
theory of its convergence is less straightforward than that of penultimate remaindering 
and will be reserved for a separate communication. 


2. The Transformation of Small Errors in Remainders 

It is well known and evident, yet important for the present purpose, 
that remaindering with respect to an arbitrary divisor, say $d_m(x)$, is a 
linear operation, in the sense that if $f(x)$ and $F(x)$ are arbitrary polynomials 

* This paper was assisted in publication by a grant from the Carnegie Trust for the 
Universities of Scotland. 





which are being divided by $d_m(x)$, and if respective remainders $r_h(x)$ and 
$R_h(x)$ of the same degree h are reached at a certain stage (not necessarily 
the final stage) in these divisions, then the remainder of degree h arising 
from division of $\lambda f(x) + \mu F(x)$ by $d_m(x)$ is $\lambda r_h(x) + \mu R_h(x)$. 
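
The linearity of remaindering is easy to confirm numerically. The sketch below is illustrative only: it uses ordinary signed coefficients (highest power first) rather than the alternating-sign convention of this paper, and the function name, the second polynomial and the multipliers are arbitrary choices made for the example.

```python
def partial_remainder(f, d, steps):
    """Carry out `steps` steps of long division of f by the monic divisor d
    and return the coefficients of the remainder reached at that stage."""
    r = [float(c) for c in f]
    for i in range(steps):
        c = r[i]                      # next quotient coefficient
        for j, dj in enumerate(d):
            r[i + j] -= c * dj        # subtract c * x^k * d(x)
    return r[steps:]


f = [1, -16, 72, -96, 24]   # the Laguerre polynomial used later in the paper
F = [2, 0, -3, 5, 1]        # an arbitrary second polynomial
d = [1, -6, 8]              # an arbitrary monic divisor
lam, mu = 3.0, -2.0

combined = [lam * a + mu * b for a, b in zip(f, F)]
lhs = partial_remainder(combined, d, 2)
rhs = [lam * a + mu * b
       for a, b in zip(partial_remainder(f, d, 2), partial_remainder(F, d, 2))]

if __name__ == "__main__":
    print(lhs)
    print(rhs)
```

The two printed lists agree, whether the division is stopped at an intermediate stage (as here, after two steps) or carried to the final remainder.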

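Let, then, the polynomial 

$$f_n(x) = x^n - a_1x^{n-1} + a_2x^{n-2} - \dots + (-)^n a_n \tag{1}$$

$$\phantom{f_n(x)} = (x-\alpha_1)(x-\alpha_2)\dots(x-\alpha_n) \tag{2}$$

have an exact divisor $d_m(x)$ of degree $m < n$, where 

$$d_m(x) = x^m - b_1x^{m-1} + b_2x^{m-2} - \dots + (-)^m b_m \tag{3}$$

$$\phantom{d_m(x)} = (x-\beta_1)(x-\beta_2)\dots(x-\beta_m), \tag{4}$$

the quotient in this division being 

$$q_{n-m}(x) = x^{n-m} - c_1x^{n-m-1} + c_2x^{n-m-2} - \dots + (-)^{n-m}c_{n-m}. \tag{5}$$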


Suppose, however, that we divide $f_n(x)$ not by $d_m(x)$ but by $t_m(x)$, 
which, except for small errors $\epsilon_i$ in the coefficients of its terms later than 
$x^m$, would be the same as $d_m(x)$; then errors will be produced in the 
coefficients of the successive remainders. To find these, let $x^s d_m(x)$, 
where s is an arbitrary integer, be divided by 

$$t_m(x) = x^m - (b_1+\epsilon_1)x^{m-1} + (b_2+\epsilon_2)x^{m-2} - \dots + (-)^m(b_m+\epsilon_m). \tag{6}$$

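The first remainder is evidently 

$$\epsilon_1x^{m+s-1} - \epsilon_2x^{m+s-2} + \dots + (-)^{m-1}\epsilon_m x^s. \tag{7}$$

We shall associate its coefficients with the vector of errors in the coefficients 
$b_i$ of $t_m(x)$, namely $\epsilon = \{\epsilon_1\ \epsilon_2 \dots \epsilon_m\}$. The next remainder in this division, 
if terms of higher than the first degree in the errors are neglected, is seen 
to be 

$$(b_1\epsilon_1 - \epsilon_2)x^{s+m-2} - (b_2\epsilon_1 - \epsilon_3)x^{s+m-3} + \dots + (-)^{m-1}b_m\epsilon_1x^{s-1}. \tag{8}$$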

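If, therefore, the error-vector at this second stage is $\{\epsilon_1'\ \epsilon_2' \dots \epsilon_m'\}$, we 
have 

$$\begin{bmatrix} \epsilon_1' \\ \epsilon_2' \\ \vdots \\ \epsilon_{m-1}' \\ \epsilon_m' \end{bmatrix} = \begin{bmatrix} b_1 & -1 & & & \\ b_2 & & -1 & & \\ \vdots & & & \ddots & \\ b_{m-1} & & & & -1 \\ b_m & & & & \end{bmatrix} \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_{m-1} \\ \epsilon_m \end{bmatrix}, \quad\text{or}\quad \epsilon' = B\epsilon. \tag{9}$$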




Thus B is the matrix that transforms the error-vector in a remainder at one 
stage into the error-vector in the next remainder. The error-vector at the 
kth stage of this particular division will be $B^{k-1}\epsilon$. In the theory of 
matrices B is familiar as a rational canonical form, except that more 
commonly $B'$, its transpose, is taken as canonical. Its characteristic 
polynomial is $(-)^m d_m(x)$, and so its latent roots are the zeros $\beta_i$ of $d_m(x)$. 
Its latent vectors are readily seen (in the case of distinct latent roots) to be 

$$\Bigl\{1 \quad \sum\nolimits_{[h]}\beta_i \quad \sum\nolimits_{[h]}\beta_i\beta_j \quad \dots \quad \prod\nolimits_{[h]}\beta_i\Bigr\},$$

where [h] denotes that $\beta_h$ is supposed absent in all the indicated summations and 
in the last product in the bracket. If the elements in these latent vectors 
are given alternate plus and minus sign, they appear as coefficients in the 
divisor polynomials $d_m(x)/(x - \beta_h)$ of degree $m-1$, such as figure, for 
example, in Lagrange's interpolation formula. 
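
The statement about latent roots and latent vectors can be verified on a small case. The sketch below assumes that B carries $b_1, \dots, b_m$ down its first column and $-1$ along the superdiagonal, as the recurrence in (8) implies; the zeros 1, 2, 3 and all function names are hypothetical choices for illustration.

```python
def elementary(values):
    """Elementary symmetric functions 1, e1, e2, ... of the given numbers."""
    e = [1.0]
    for v in values:
        e = [a + v * b for a, b in zip(e + [0.0], [0.0] + e)]
    return e


def canonical_matrix(b):
    """B with b_1..b_m down the first column and -1 on the superdiagonal."""
    m = len(b)
    return [[b[i] if j == 0 else (-1.0 if j == i + 1 else 0.0)
             for j in range(m)] for i in range(m)]


def matvec(matrix, vector):
    return [sum(x * y for x, y in zip(row, vector)) for row in matrix]


betas = [1.0, 2.0, 3.0]                     # hypothetical zeros of d_m(x)
B = canonical_matrix(elementary(betas)[1:])  # b = (6, 11, 6)

if __name__ == "__main__":
    for h, beta in enumerate(betas):
        v = elementary(betas[:h] + betas[h + 1:])   # beta_h left out
        print(beta, v, matvec(B, v))
```

For each zero the product of B with the vector is the same vector multiplied by that zero, so each zero is a latent root with the stated latent vector.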

3. The Reducing Transformation 

When a remainder of degree m is used iteratively as a divisor, it is 
convenient to ensure, by dividing through by the coefficient of its highest 
term, that this term shall become $x^m$. We therefore examine this reducing 
transformation of errors, produced when a remainder, which for convenience 
we shall write in the form 

$$k\{(1+\epsilon_0)x^m - (b_1+\epsilon_1)x^{m-1} + \dots + (-)^m(b_m+\epsilon_m)\}, \tag{1}$$

is divided by its leading coefficient $k(1+\epsilon_0)$. Carrying out this division 
and neglecting, as before, error terms of higher than the first degree, we 
find that the error-vector has been transformed by 

$$-k^{-1}\begin{bmatrix} b_1 & -1 & & & \\ b_2 & & -1 & & \\ \vdots & & & \ddots & \\ b_m & & & & -1 \end{bmatrix}. \tag{2}$$

Now this, except for the scalar factor $-k^{-1}$, is merely B, extended consistently 
by a further column. However, if the remainder is penultimate the 
coefficient in its last term has no error; for it is simply the constant term 
of the dividend $f_n(x)$ "brought down". Therefore in reducing a penultimate 
remainder the last column in (2) is unnecessary, and the reducing 






transformation is $-k^{-1}B$, acting on a vector of m elements. We may note 
similarly that in an antepenultimate remainder the last two coefficients 
would be free from error, and then the reducing transformation would be 
$-k^{-1}B$ with last column removed. Such reducing operations play an 
important part in the theory of iterated remainders; and so it will be 
convenient to use the notations $B(m, m+1)$ and $B(m, m-1)$ for the 
matrices obtained by extending or diminishing B by a column, and to use 
a similar notation for the extensions or diminutions, consistently applied, 
of powers of B. At the same time let us denote the "B" corresponding 
to a divisor $(x - v)d_m(x)$ by $B_v$, so that 

$$B_v = \begin{bmatrix} b_1+v & -1 & & & \\ b_2+vb_1 & & -1 & & \\ \vdots & & & \ddots & \\ b_m+vb_{m-1} & & & & -1 \\ vb_m & & & & \end{bmatrix}. \tag{3}$$


Then it is easy to verify the following results: 

$$B(m, m+1)B_v = B^2(m, m+1), \qquad B(m, m+1)B_v^s = B^{s+1}(m, m+1),$$

$$B(m, m+1)B_v^s(m+1, m) = B^{s+1}. \tag{4}$$

The important feature is that $B(m, m+1)$, as a premultiplier, obliterates v. 
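
The way in which the premultiplier removes v can be seen on a small numerical case. The sketch below rests on two assumptions that are readings of the text rather than statements in it: that $B(m, m+1)$ is B with one further column continuing the diagonal of $-1$'s, and that $B^2(m, m+1)$ may be formed as $B \cdot B(m, m+1)$. The coefficients $b = (3, 5)$ and the function names are hypothetical.

```python
def canonical(b, cols=None):
    """Matrix with b down the first column and -1 on the superdiagonal;
    cols = len(b) + 1 gives the extended form B(m, m+1)."""
    m = len(b)
    cols = m if cols is None else cols
    return [[b[i] if j == 0 else (-1.0 if j == i + 1 else 0.0)
             for j in range(cols)] for i in range(m)]


def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def with_extra_factor(b, v):
    """Alternating-sign coefficients of (x - v) * d_m(x)."""
    ext = [1.0] + list(b) + [0.0]
    return [ext[i + 1] + v * ext[i] for i in range(len(b) + 1)]


b = [3.0, 5.0]                                   # d_2(x) = x^2 - 3x + 5
target = matmul(canonical(b), canonical(b, 3))   # B . B(m, m+1)

if __name__ == "__main__":
    for v in (0.0, 7.0, -2.0):
        B_v = canonical(with_extra_factor(b, v))
        print(v, matmul(canonical(b, 3), B_v) == target)
```

The product is the same for every value of v tried, which is the sense of the remark above.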

4. Application to Penultimate Remaindering 

The above results give the theory of iterated division and remaindering. 
In particular, for the method of the penultimate remainder, since 

$$f_n(x) = \{x^{n-m} - c_1x^{n-m-1} + \dots + (-)^{n-m}c_{n-m}\}\,d_m(x), \tag{1}$$

the exact penultimate remainder with respect to the divisor $d_m(x)$ is 
$(-)^{n-m}c_{n-m}d_m(x)$. Now by § 2 (9) and the linear nature of remaindering, 
the errors in the penultimate remainder with respect to $t_m(x)$ will constitute 
the error-vector 

$$\{B^{n-m-1} - c_1B^{n-m-2} + \dots + (-)^{n-m-1}c_{n-m-1}I\}\epsilon, \tag{2}$$

and on this the reducing transformation, by § 3 (2) et seq., will impose the 
further transformation $(-)^{n-m-1}c_{n-m}^{-1}B$. Thus the vector of errors in the 
reduced penultimate remainder (r.p.r.) will be $R\epsilon$, where 

$$R = (-)^{n-m-1}c_{n-m}^{-1}\{B^{n-m} - c_1B^{n-m-1} + \dots + (-)^{n-m-1}c_{n-m-1}B\}. \tag{3}$$






The condition for convergence of the iteration is therefore that all the 
latent roots of R, namely each 

$$\rho = (-)^{n-m-1}c_{n-m}^{-1}\{\beta^{n-m} - c_1\beta^{n-m-1} + \dots + (-)^{n-m-1}c_{n-m-1}\beta\}, \tag{4}$$

where $\beta$ is a latent root of B, that is to say a zero of $d_m(x)$, shall be such that 
$|\rho| < 1$. The rapidity and nature of the convergence will depend on the 
root or roots of maximum modulus. 
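
For a linear divisor the criterion is simple enough to evaluate directly. The sketch below is an illustration under an assumption of its own: written with ordinary signed coefficients, the single latent root for a divisor $x - \beta$ is taken to be $\rho = 1 - q(\beta)/q(0)$, where $q(x) = f(x)/(x - \beta)$ is the exact quotient. The polynomial is the Laguerre polynomial used in the examples of § 6, its zeros are quoted to twelve figures, and the function names are hypothetical.

```python
def deflate(coeffs, root):
    """Quotient of the polynomial by (x - root), by Horner's scheme;
    the remainder is dropped."""
    out, acc = [], 0.0
    for c in coeffs[:-1]:
        acc = acc * root + c
        out.append(acc)
    return out


def evaluate(coeffs, x):
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


def rpr_factor(coeffs, root):
    """Convergence factor of penultimate remaindering towards x - root."""
    q = deflate(coeffs, root)
    return 1.0 - evaluate(q, root) / q[-1]


LAGUERRE4 = [1, -16, 72, -96, 24]

if __name__ == "__main__":
    for zero in (0.322547689619, 1.745761101158):
        print(zero, round(rpr_factor(LAGUERRE4, zero), 4))
```

On this reckoning the factor is well below unity in modulus for the smallest zero and exceeds unity for the zero near 1·75, so that unmodified penultimate remaindering would be expected to converge to the former and not to the latter.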

These results were implicit in our former paper (Aitken, 1950), but 
were not clearly perceived as admitting the present formulation. They 
give a more significant expression to the essential basis of the method of 
iterated r.p.r., namely, that apart from sign the error-transforming matrix 
R is a polynomial in B which exactly copies the reduced penultimate 
quotient. 

The conditions of convergence involve the divisor $d_m(x)$, which is not 
known exactly, and which it is the very object of the method to ascertain. 
Consequently, the nature and rapidity of the convergence can be determined 
in practice only approximately; but this is a limitation inherent in 
every method that proceeds towards an unknown solution by successive 
approximation. 


5. Alternative Iterative Processes 

At this stage many other less simple iterations, also based on remainders, 
suggest themselves. For example, we might divide $f_n(x)$ by $t_m(x)$, 
stopping at the antepenultimate remainder, of degree $m+1$; next, taking 
this, reduced, as a new divisor, we might divide $f_n(x)$ again up to the 
ultimate remainder, which will be of degree m, and reduce. This two- 
stage process could then be iterated. We find without trouble, by the 
linearity of remaindering and § 3 (4), that the error-transforming matrix for 
these two stages together is the product of two polynomials in B, the first 
copying the reduced antepenultimate quotient in the first division, the 
second copying the reduced ultimate quotient in the second division. In 
fact a general theorem, probably only of theoretical interest, may be 
enunciated at this point. 

Let $f_n(x)$ be divided by $t_m(x)$ up to any remainder of degree ≥ m. 
Let the remainder, reduced, be divided again into $f_n(x)$ up to any remainder 
of degree ≥ m. Continue thus, but so that the last division in 
such a sequence of operations gives a remainder of degree m. Then, 
under conditions of convergence, such a cycle can be iterated to give, in 
the limit, $d_m(x)$. The error-vector in the last reduced remainder is in fact 
$R\epsilon$, where, with suitable sign, R is a product of polynomials in B which 




exactly copy the successive reduced quotients, partial or ultimate as the 
case may be, that would arise if the divisions were exact. 

It is hardly necessary to give a numerical illustration of the process 
of antepenultimate remaindering. Even though its two-stage cycle may 
often show more convergence than two consecutive stages of penultimate 
remaindering, the latter is the simpler process; and modern desk machines, 
to say nothing of automatic machines working to a coded programme, 
have so reduced the time and tedium of computation that a simple method 
such as penultimate remaindering, or Lin’s later method, provided that it 
is reasonably convergent, is to be preferred to a more complicated method, 
even if the latter is more rapidly convergent. Some have questioned 
whether penultimate remaindering and Lin’s later method are as useful in 
practice as the time-honoured Newton-Raphson iteration or its extension 
(Bairstow, 1920) to the case of a quadratic factor; for these can be carried 
out in the same way, admittedly in two stages for each iteration, by Horner 
processes of division and remaindering. The answer probably is: that 
for purposes of programming, the method of r.p.r. is the simplest of all, 
and Lin’s method almost as simple. Therefore it will be of value to 
see whether the convergence of these methods can be accelerated in any 
convenient way. 

6. Accelerative Processes: Quadratic Convergence 

In an earlier paper on this subject (Aitken, 1950) certain devices, 
already known to be serviceable in other applications, were applied to 
enhance the convergence of penultimate remaindering. We now survey 
some other possibilities of the same kind. 

Consider first the ultimate remainder in the division of $f_n(x)$ by $t_m(x)$. 
Here we easily find, in the notation and by the theorems of §§ 2 and 3, that 
the error-vector of the coefficients in the remainder is $q_{n-m}(B)\epsilon$. Thus, 
in theory, if we operate on these remainder-coefficients with $\{q_{n-m}(B)\}^{-1}$ 
we shall obtain, apart from terms of higher than the first order, the corrections 
to be applied to the coefficients of $t_m(x)$. In practice we shall not 
know $q_{n-m}(B)$ exactly, and therefore shall have to use the approximation 
to it provided by the work-sheet. The application of this correction, for 
small values of m, is equivalent to extant methods. For example, when 
$m = 1$, the correction is precisely the classical Newton-Raphson one; when 
$m = 2$, the two correction terms obtained by applying the approximate 
$\{q_{n-2}(B)\}^{-1}$ to the two coefficients in the remainder are precisely Bairstow’s 
corrections (Bairstow, 1920, pp. 558-560) for the case of a quadratic 
divisor. For completeness and for later comparison, though there is no 





essential novelty here, we shall illustrate by examples the appearance of the 
work-sheet when two successive divisions are used to effect these corrections. 

Example 1.—To improve the approximation x = 1·75 for a zero of the fourth Laguerre 
polynomial x⁴ − 16x³ + 72x² − 96x + 24. 

                  1    16·0       72·0       96·0       24·0
  1  1·75         1    14·25      47·0625    13·6406     0·1290
     −0·00424     1    12·5       25·1875   −30·4375
  1  1·74576      1    14·25424   47·11552   13·74761   −0·00003


Here the correction, namely −0·1290/30·4375 = −0·00424, is the Newton-Raphson 
one, the divisions being the usual Horner processes for obtaining by the remainder 
theorem a functional value and a derivative. 
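
The two divisions of Example 1 can be reproduced in a few lines. The following is a minimal sketch using ordinary signed coefficients, highest power first, in place of the alternating-sign layout of the work-sheet; the function names are illustrative.

```python
def horner(coeffs, x0):
    """Divide the polynomial by (x - x0); return (quotient, remainder)."""
    out, acc = [], 0.0
    for c in coeffs:
        acc = acc * x0 + c
        out.append(acc)
    return out[:-1], out[-1]


def newton_step(coeffs, x0):
    """One Newton-Raphson correction obtained from two Horner divisions."""
    quotient, value = horner(coeffs, x0)   # remainder is f(x0)
    _, slope = horner(quotient, x0)        # remainder is f'(x0)
    return x0 - value / slope


LAGUERRE4 = [1, -16, 72, -96, 24]

if __name__ == "__main__":
    quotient, value = horner(LAGUERRE4, 1.75)
    print(quotient, value)
    print(newton_step(LAGUERRE4, 1.75))
```

The first division gives the functional value 0·1289, the second gives the derivative 30·4375, and the corrected approximation agrees with the 1·74576 of the work-sheet.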

Example 2.—With the same Laguerre polynomial, to improve the approximation 
x² − 6x + 8 for a quadratic factor by Bairstow's corrections. 

                           1    16·0       72·0       96·0      24·0
  1  6·0      8·0          1    10·0        4·0       −8·0      −8·0
                                 1          4·0       −4·0
     0·267   −0·133                       −28·0      −32·0      Δ = 240·0
  1  6·267    7·867        1     9·733      3·136     −0·2247   −0·6733
                                 1          3·466     −4·7307
     0·01546  0·05319                     −26·4521   −27·2670   Δ = 219·644
  1  6·28246  7·92019      1     9·71754    3·02975    0·00095   0·00380

(The actual factor is x² − 6·28238x + 7·91985.) 

The four italicized entries at the second stage, for example, arise from dividing the 
quotient of the initial division once again by the divisor x² − 6·267x + 7·867. It is easy to 
prove that they furnish the approximate elements of $q_2(B)$ and (provided that we divide 
by the determinant Δ) of its reciprocal, thus: 

$$q_2(B) \approx \begin{bmatrix} -26.4521 & 3.466 \\ -27.2670 & -4.7307 \end{bmatrix}, \qquad \text{reciprocal} = \frac{1}{219.644}\begin{bmatrix} -4.7307 & -3.466 \\ 27.2670 & -26.4521 \end{bmatrix}.$$

The elements of the reciprocal operate on the errors {−0·2247 −0·6733} to give the 
corrections that are shown at the left, {0·01546 0·05319}. All that is necessary on the 
work-sheet is to enter the value of Δ; the computer will easily see how the multipliers consort 
with the errors, and which one in each pair of multipliers receives the change of sign. 
A similar routine holds for every stage. The convergence, if the procedure is strictly 
carried out, is quadratic, in the sense that each successive vector of corrections involves 
only squares, products and higher powers of the elements of the preceding error-vector. 
However, just as in the Newton-Raphson case (Whittaker and Robinson, 1944, p. 91), 
when satisfactory approximation is being attained a modification is possible; the matrix 
operating on the corrections, itself tending all the time to $\{q_2(B)\}^{-1}$, is then so stable that 
it need not be re-calculated. In this way, sacrificing quadratic but securing a good 












linear convergence, we use a constant matrix and constant Δ and thus save the second line 
of division at each stage. It is not necessary to exemplify numerically this very convenient 
alternative procedure. 
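
The quadratic case of Example 2 can be sketched in the same spirit. The code below is an illustration, not the work-sheet routine itself: it uses ordinary signed coefficients, writes the trial factor as x² + px + s (so that p = −b₁ and s = b₂ in the notation of this paper), and represents multiplication by x modulo the trial divisor by the matrix M = [[−p, 1], [−s, 0]] acting on the remainder coefficients. The corrections are obtained by applying the reciprocal of q(M) to the final remainder. All function names are hypothetical.

```python
def divide(f, t):
    """Long division of f by a monic t (coefficients, highest power first)."""
    r = [float(c) for c in f]
    steps = len(f) - len(t) + 1
    quotient = []
    for i in range(steps):
        c = r[i]
        quotient.append(c)
        for j, tj in enumerate(t):
            r[i + j] -= c * tj
    return quotient, r[steps:]


def quadratic_step(f, p, s):
    """One correction of the trial factor x^2 + p*x + s of f."""
    q, (r1, r0) = divide(f, [1.0, p, s])
    # Evaluate q(M) by Horner's rule: X <- X*M + coef*I at each stage.
    a = b = c = d = 0.0
    for coef in q:
        a, b, c, d = -a * p - b * s + coef, a, -c * p - d * s, c + coef
    det = a * d - b * c
    # Solve q(M) * (dp, ds) = (r1, r0) by Cramer's rule.
    dp = (d * r1 - b * r0) / det
    ds = (a * r0 - c * r1) / det
    return p + dp, s + ds


LAGUERRE4 = [1, -16, 72, -96, 24]

if __name__ == "__main__":
    p, s = -6.0, 8.0
    for _ in range(4):
        p, s = quadratic_step(LAGUERRE4, p, s)
        print(round(-p, 5), round(s, 5))
```

Starting from x² − 6x + 8, the first pass gives 6·267 and 7·867 as in the work-sheet (with determinant 240), and further passes settle on the factor quoted under the example.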

The Newton-Raphson correction and Bairstow's corrections are based 
on errors in final remainders. When we investigate the corresponding 
adjustments derived from errors in penultimate remainders, we find that 
they exist and are effective, but are more complicated to apply. 

For example, with the same work-sheet as in Example 1, the second approximation 
would be computed thus: 

(24 × 30·4375)/(13·6406 × 30·4375 + 0·1290 × 25·1875) = 1·74579. 

Since the actual root is 1·745761, we see that the Newton-Raphson process, which gave 
1·74576 for second approximation, is not likely to find a rival in the above. 

However, just as there is a modified Newton-Raphson or Bairstow 
procedure based on constant multipliers, so there is a modified procedure 
based on r.p.r., which can be achieved in the following convenient way. 
Suppose we are trying for a linear factor the exact value of which is 
$x - b$. Let us multiply the dividend $f_n(x)$ by $x - c$, where c is so chosen 
that the penultimate quotient arising from the division of $(x - c)f_n(x)$ by 
$x - b$ shall vanish for $x = b$, that is, shall contain $x - b$ as a factor. A simple 
application of the remainder theorem then gives c as follows. Let the 
quotient which arises from the division of $f_n(x)$ by $x - b$ be once again 
divided by $x - b$, exactly as in the second stage of a Horner transformation 
or a Newton-Raphson process; then if $d_{n-2}$ and $d_{n-1}$ are the two last 
coefficients in the second quotient, $c = -d_{n-1}/d_{n-2}$. This is the theory; 
in practice our divisor is only an approximation to $x - b$, and therefore 
we shall have only an approximation to c. 

Example 3.—The same problem as Example 1. We take the two last entries in 
the line showing the second division, and construct c = 30·4375/25·1875 = 1·208, or 1·2 
approximately. We therefore multiply the Laguerre polynomial by x − 1·2, and proceed 
with r.p.r. 

                1   17·2       91·2       182·4      139·2      28·8
  1  1·75       1   15·45      64·1625    70·1156    16·4977    28·8
  1  1·74570    1   15·45430   64·22143   70·28865   16·49710   28·8
  1  1·745761

A good convergence is visible; in fact the successive errors tend to a ratio of about
1 : 340. On the other hand, a rougher value of $c$, such as 1.25, would have produced
good but not striking convergence.
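
The steps of this example can be sketched in a few lines of Python. This is an illustrative sketch only, not the author's work-sheet: the function names are hypothetical, the coefficients are entered with their ordinary signs (the Laguerre polynomial being taken as $x^4 - 16x^3 + 72x^2 - 96x + 24$, an assumption that agrees with the figures above), and the multiplier is formed as the ratio of the two last entries of the second-division line, as in the example.

```python
def synthetic_division(coeffs, b):
    """Divide a polynomial (coefficients, highest power first) by x - b.

    Returns the synthetic-division row: quotient coefficients followed
    by the remainder.
    """
    row = [coeffs[0]]
    for a in coeffs[1:]:
        row.append(a + b * row[-1])
    return row


def multiplier_from_second_division(coeffs, b):
    """Ratio of the two last entries in the line of the second division."""
    first = synthetic_division(coeffs, b)        # quotient and remainder
    second = synthetic_division(first[:-1], b)   # divide the quotient again
    return second[-1] / second[-2]


def multiply_by_linear(coeffs, c):
    """Coefficients of (x - c) * f(x)."""
    out = list(coeffs) + [0.0]
    for i, a in enumerate(coeffs):
        out[i + 1] -= c * a
    return out


def rpr_step(coeffs, b):
    """One step of penultimate remaindering with the linear divisor x - b.

    The division is stopped one stage early, leaving the remainder
    r*x + a_n; the root of that remainder is the next approximation.
    """
    row = synthetic_division(coeffs, b)
    return -coeffs[-1] / row[-2]


laguerre = [1.0, -16.0, 72.0, -96.0, 24.0]
c = multiplier_from_second_division(laguerre, 1.75)   # 30.4375 / 25.1875
prepared = multiply_by_linear(laguerre, 1.2)           # rounded multiplier, as in the example

b = 1.75
for _ in range(6):
    b = rpr_step(prepared, b)
    print(f"{b:.6f}")
```

Starting from 1.75, the first two steps agree with the work-sheet values 1.74570 and 1.745761 to the figures shown there.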

It is to be noted that ordinary r.p.r., applied with this particular
divisor to the Laguerre polynomial, would diverge. Multiplication by
$x - c$ prepares the polynomial for the application of r.p.r., removing
divergence and ensuring good convergence. It is necessary that the value
of $c$ should be rather close to the ideal value; that is, that the approximation
to the linear divisor should not be too rough.

The device can be extended to divisors of higher degree than the first. 
For example, in the case of a quadratic divisor there exists a multiplier
$x^2 - cx + d$ such that the penultimate quotient, when $(x^2 - cx + d)f_n(x)$ is
divided by this divisor, contains the divisor as a factor. This condition
determines c and d as solutions of two simultaneous linear equations with 
coefficients appearing very naturally in the line showing the second 
division on the work-sheet; but we shall go no further into detail here. 


7. Concluding Observations 

Some remarks of a general nature may be appended. It was pointed 
out in the earlier paper that, so far as division by a linear factor is concerned, 
penultimate remaindering is not new; it is merely ordinary iteration, with 
a particular way of breaking up the polynomial equation into two parts. 
Equally, and in a dual respect, when the divisor is of degree $n - 1$, the
process is not new; for the coefficients of the divisor form the vector
$b = \{1 \;\, b_1 \;\, b_2 \;\, \ldots \;\, b_{n-1}\}$, and if $A$ is the rational canonical matrix (in the sense
of the present paper) corresponding to the dividend (not the divisor), then
penultimate remaindering is just a somewhat disguised way of building the
vector sequence $b, Ab, A^2 b, A^3 b, \ldots$, and this is a well-known way of
approximating to the dominant latent root of $A$. It may be contrasted
with the formation of the vector-sequence $u, A'u, (A')^2 u, \ldots$, which is
equivalent (cf. Fry, 1945, p. 104) to the old method of D. Bernoulli
(Whittaker and Robinson, 1944, p. 98) of building up a sequence of values
$u_t$ satisfying the difference equation with constant coefficients,

$$u_{t+n} = a_1 u_{t+n-1} + \ldots,$$


to which the polynomial is auxiliary. 
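
Bernoulli's method, as referred to here, can be illustrated by a short Python sketch. The normalisation is an assumption made for the illustration: the polynomial is written $a_0 x^n + a_1 x^{n-1} + \ldots + a_n$ with ordinary signs, so that the difference equation reads $a_0 u_{t+n} + a_1 u_{t+n-1} + \ldots + a_n u_t = 0$, and the ratio of successive values $u_{t+1}/u_t$ is taken as the approximation to the dominant root. The function name and the starting values are hypothetical.

```python
def bernoulli_dominant_root(coeffs, steps=80):
    """Approximate the dominant root of a polynomial by Bernoulli's method.

    coeffs holds a_0, ..., a_n (highest power first). The sequence u_t
    satisfies a_0*u_{t+n} + a_1*u_{t+n-1} + ... + a_n*u_t = 0, and the
    ratio of successive terms tends to the root of largest modulus when
    that root is real and strictly dominant.
    """
    n = len(coeffs) - 1
    window = [0.0] * (n - 1) + [1.0]   # u_t, ..., u_{t+n-1}
    ratio = None
    for _ in range(steps):
        # pair a_1 with u_{t+n-1}, a_2 with u_{t+n-2}, ..., a_n with u_t
        nxt = -sum(a * u for a, u in zip(coeffs[1:], reversed(window))) / coeffs[0]
        ratio = nxt / window[-1]
        window = window[1:] + [nxt]
    return ratio


# Laguerre polynomial of degree four, used in the examples of this paper
print(bernoulli_dominant_root([1.0, -16.0, 72.0, -96.0, 24.0]))
```

For the Laguerre polynomial the sequence approaches the largest root, near 9.39507, whereas the prepared r.p.r. of the preceding section was directed at the root 1.745761.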

Most of the papers included in the list of references given below are 
concerned not with penultimate remaindering but with Lin’s later method. 
E. H. Sealy (to whose work reference should have been made in our 
previous paper) derived criteria for convergence of r.p.r. in the case of 
linear and quadratic divisors, and used accelerative processes of the usual 
kind. B. Friedman proposes a method based on iterated division, but 
using quotients, not remainders; we shall examine elsewhere the transformation
of errors in his method. Luke and Ufford extend Lin’s method
in the direction of resolving $f_n(x)$ not into two but into three or more
polynomial factors, and they tabulate formulae useful for this purpose; 
naturally these formulae are more complicated than the simple bilinear 
recurrences of synthetic division. 


REFERENCES TO LITERATURE 

Aitken, A. C., 1950. Proc. Roy. Soc. Edin., A, lxiii, 174-191.

Bairstow, L., 1920. Applied Aerodynamics, Longmans, Green & Co., London.

Friedman, B., 1949. Commun. Pure Appl. Math., ii, 195-208.

Fry, T. C., 1945. Quart. Appl. Math., iii, 89-105.

Lin, S. N., 1939. Thesis for Sc.D., Massachusetts Institute of Technology.

-, 1941. Journ. Math. Phys., xx, 231-241.

-, 1943. Journ. Math. Phys., xxii, 60-77.

Luke, Y. L., and Ufford, D., 1951. Journ. Math. Phys., xxx, 95-109.

Sealy, E. H., 1940. Thesis for Ph.D., University of Edinburgh.

Whittaker, E. T., and Robinson, G., 1944. The Calculus of Observations,
4th ed., Blackie & Sons, London and Glasgow.


(Issued separately August 23, 1952)





XXIV.— The Solution of a Functional Equation.* By A. H. 
Read, M.A., United College, University of St Andrews. 
Communicated by Dr D. E. RUTHERFORD. 

(MS. received September 25, 1951. Revised MS. received February 29, 1952. 

Read May 5, 1952) 


Synopsis 

Analytic solutions of the functional equation $f[z, \phi\{g(z)\}] = \phi(z)$, in which $f(z, w)$ and
$g(z)$ are given analytic functions and $\phi(z)$ is the unknown function, are investigated in the
neighbourhood of points $\zeta$ such that $g(\zeta) = \zeta$. Conditions are established under which
each solution $\phi(z)$ may be given as the limit of a sequence of functions $\phi_n(z)$ defined by
the recurrence relation $\phi_{n+1}(z) = f[z, \phi_n\{g(z)\}]$, the function $\phi_1(z)$ being to a large extent
arbitrary.


1. Introduction 

The solution of functional equations of the type

$$f[z, \phi\{g(z)\}] = \phi(z), \tag{1}$$

where $\phi(z)$ is the unknown function, is closely related to the theory of
iteration. The iterates of a function $g(z)$ are the functions $g_n(z)$ defined by
the relations

$$g_1(z) = g(z), \qquad g_n(z) = g\{g_{n-1}(z)\} \quad (n = 2, 3, \ldots).$$

If this sequence of functions converges for a fixed value of $z$, and if $g(z)$
is continuous, the limit $\zeta$ clearly satisfies the equation $g(\zeta) = \zeta$. If $\zeta$
satisfies this equation it is called a double point of the function $g(z)$, and
such double points are of particular importance in connection with
functional equations of the type (1).

Solutions of (1) have been given in certain special cases. In particular
Koenigs (1884) studies Schroeder's equation

$$\phi\{g(z)\} = s\,\phi(z), \tag{2}$$

where $g(z)$ is a given function and $s$ a given constant. He shows that,
under certain conditions,

$$\lim_{n\to\infty} s^{-n}\{g_n(z) - \zeta\}$$

exists in a region containing a double point $\zeta$ of $g(z)$, and represents an

* This paper was assisted in publication by a grant from the Carnegie Trust for the 
Universities of Scotland. 




analytic solution of (2). Fatou (1920) considers the more general equation

$$\phi(z) = A(z)\,\phi\{g(z)\}, \tag{3}$$

in which $A(z)$ and $g(z)$ are given functions. He shows that, subject to
certain conditions, the infinite product

$$A(z) \prod_{r=1}^{\infty} A\{g_r(z)\}$$

converges and represents an analytic solution of the equation (3) in the
neighbourhood of the origin.

These solutions exemplify a generalized iterative process which can be
applied to equation (1). Let the functions $\phi_n(z)$ be defined by the recurrence
relation

$$\phi_{n+1}(z) = f[z, \phi_n\{g(z)\}], \tag{4}$$

where $f(z, w)$ is a regular function of the two variables $z$ and $w$. Then if,
when $z$ lies in some domain which contains also $g(z)$, $\phi_n(z)$ tends to a
regular limit function $\phi(z)$ as $n \to \infty$, $\phi(z)$ will be a solution of the equation
(1). It is the object of the present paper to establish conditions for the
convergence of the sequence $\phi_n(z)$. The principal result is formulated in
the theorem of § 4. The author wishes to express his indebtedness to the
referees for a number of simplifications in the argument.


2. Preliminary Lemmas 

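Lemma 1. If $|k| < 1$, $c_n = u_{n+1} - k u_n$ and $c_n \to c$ as $n \to \infty$, then $u_n$
tends to a limit as $n \to \infty$.

To prove this, write $v_n = u_n - c/(1 - k)$. Then the equation
$c_n = u_{n+1} - k u_n$ reduces to

$$v_{n+1} - k v_n = c_n - c.$$

Suppose that $\epsilon > 0$. Choose $N$ so that $|c_n - c| < \epsilon$ when $n \ge N$. Then

$$v_n = v_0 k^n + \left[\sum_{r=0}^{N-1} + \sum_{r=N}^{n-1}\right] k^{n-1-r}(c_r - c),$$

and therefore

$$\limsup_{n\to\infty} |v_n| \le \epsilon \lim_{n\to\infty} \sum_{r=N}^{n-1} |k|^{n-1-r} = \frac{\epsilon}{1 - |k|}.$$

It follows that $v_n \to 0$, and so $u_n \to c/(1 - k)$, as $n \to \infty$.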
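
A small numerical illustration of Lemma 1 may be helpful. The following Python sketch uses hypothetical values chosen only for the illustration ($k = 0.5$, $c_n = 3 + 0.9^n$, $u_0 = 10$); it builds $u_n$ from $u_{n+1} = k u_n + c_n$ and compares the result with the limit $c/(1 - k)$ given by the lemma.

```python
def iterate_sequence(k, c_of_n, u0, steps):
    """Return u_steps, where u_{n+1} = k*u_n + c_n and u_0 = u0."""
    u = u0
    for n in range(steps):
        u = k * u + c_of_n(n)
    return u


k = 0.5


def c_of_n(n):
    # c_n tends to c = 3 as n increases
    return 3.0 + 0.9 ** n


limit_from_lemma = 3.0 / (1.0 - k)          # c / (1 - k)
u_400 = iterate_sequence(k, c_of_n, 10.0, 400)
print(u_400, limit_from_lemma)
```

The same function accepts a complex $k$, provided its modulus is less than 1.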





Lemma 2. If $|a_r| < M/R^r$, $|b_r| < K/\rho^r$, for constants $M$, $K$, $R$
and $\rho$, and the series $\sum_{n=1}^{\infty} a_n \left(\sum_{r=1}^{\infty} b_r z^r\right)^n$ is rearranged as the series
$\sum_{n=1}^{\infty} c_n z^n$, then

$$|c_n| < \frac{MK}{R + K}\left(\frac{R + K}{R\rho}\right)^n.$$

(It is well known that this rearrangement does not alter the sum of the
series provided that $|z|$ is sufficiently small.)

To prove this lemma we observe that $|c_n|$ does not exceed the coefficient
of $z^n$ in the rearrangement, as a power series in $z$, of the repeated
series

$$\sum_{n=1}^{\infty} \frac{M}{R^n}\left(\sum_{r=1}^{\infty} \frac{K z^r}{\rho^r}\right)^n.$$

This repeated series is easily found to have the sum

$$\frac{MKz}{R\rho - (R + K)z},$$

and, by expanding this expression in powers of $z$, the required coefficient is

$$\frac{MK}{R + K}\left(\frac{R + K}{R\rho}\right)^n.$$
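
The bound of Lemma 2 can be checked on a truncated series. The sketch below is an illustration under stated assumptions, not part of the original argument: the constants $M = 2$, $K = 3$, $R = 2$, $\rho = 5$ are hypothetical, the function names are invented for the purpose, and exact rational arithmetic is used so that the extremal case $a_n = M/R^n$, $b_r = K/\rho^r$ reproduces the bound exactly.

```python
from fractions import Fraction


def multiply_truncated(p, q, order):
    """Product of two power series (coefficient lists), truncated at z**order."""
    out = [Fraction(0)] * (order + 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j > order:
                break
            out[i + j] += pi * qj
    return out


def rearranged_coefficients(a, b, order):
    """Coefficients c_0..c_order of sum_{n>=1} a[n] * (sum_{r>=1} b[r] z^r)^n.

    a and b are lists indexed from 0; a[0] and b[0] are ignored. Because the
    inner series has no constant term, powers above `order` cannot
    contribute to the coefficients retained.
    """
    inner = [Fraction(0)] + [Fraction(b[r]) for r in range(1, order + 1)]
    power = [Fraction(1)] + [Fraction(0)] * order
    c = [Fraction(0)] * (order + 1)
    for n in range(1, order + 1):
        power = multiply_truncated(power, inner, order)
        for m in range(order + 1):
            c[m] += Fraction(a[n]) * power[m]
    return c


M, K, R, rho, order = 2, 3, 2, 5, 8


def bound(n):
    return Fraction(M * K, R + K) * Fraction(R + K, R * rho) ** n


a_extremal = [Fraction(0)] + [Fraction(M, R ** n) for n in range(1, order + 1)]
b_extremal = [Fraction(0)] + [Fraction(K, rho ** r) for r in range(1, order + 1)]
c_extremal = rearranged_coefficients(a_extremal, b_extremal, order)

# halving the a_n and alternating their signs keeps |a_n| < M / R**n
a_smaller = [Fraction(0)] + [Fraction((-1) ** n * M, 2 * R ** n) for n in range(1, order + 1)]
c_smaller = rearranged_coefficients(a_smaller, b_extremal, order)
```

In the extremal case each $c_n$ equals the stated bound; with the smaller coefficients the moduli fall strictly below it.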


3. The Equation $f[z, \phi(z)] = \phi(sz)$

It is convenient to discuss the equation

$$f[z, \phi(z)] = \phi(sz) \tag{5}$$

before considering the more general type (1). We shall write $|s| = \sigma$,
and we shall suppose that $\sigma > 1$. Suppose also that the equation $f(0, w) = w$
has a solution $\beta$, and that $f(z, w)$ is a regular function of the two variables
$(z, w)$ in the vicinity of the place $(0, \beta)$. We may then write

$$f(z, w) = \sum_{r,t=0}^{\infty} a_{rt}\, z^r (w - \beta)^t. \tag{6}$$

We assume in this section that $\sigma > |a_{01}|$.

Let $\phi_1(z)$ be any function which is regular at the origin, and takes the
value $\beta$ there; and define the functions $\phi_n(z)$ successively by

$$\phi_{n+1}(sz) = f[z, \phi_n(z)]. \tag{7}$$

In this section we shall show that, for each value of $z$ in a neighbourhood
of the origin, $\phi_n(z)$ tends to a limit as $n \to \infty$, and that this limit is a regular
function of $z$. We first show that, when $\phi_n(z)$ is expanded in powers of $z$,
each of the coefficients in the expansion tends to a limit as $n \to \infty$.

Since $\phi_1(z)$ is regular at the origin, all the succeeding functions $\phi_n(z)$
are regular there, and we may write

$$\phi_n(z) = \sum_{p=0}^{\infty} \alpha_p(n) z^p.$$

From the fact that $\phi_1(0) = \beta$, combined with (7), it follows that $\phi_n(0) = \beta$ for
all $n$; that is to say, $\alpha_0(n) = \beta$ for all $n$. Therefore, from (6) and (7),

$$\sum_{p=0}^{\infty} \alpha_p(n+1)\, s^p z^p = \sum_{r,t=0}^{\infty} a_{rt}\, z^r \left(\sum_{p=1}^{\infty} \alpha_p(n) z^p\right)^t. \tag{8}$$

The triple series on the right is absolutely convergent for sufficiently small
values of $z$, and can therefore be rearranged in ascending powers of $z$. Thus,
equating coefficients of $z^m$,

$$s^m \alpha_m(n+1) = a_{01}\, \alpha_m(n) + P_n, \tag{9}$$

where $P_n$ is a polynomial in those $\alpha_r(n)$ for which $r < m$. The coefficients
in $P_n$ are independent of $n$.

We now make the induction hypothesis that, for each value of $r$ less
than $m$, $\alpha_r(n)$ tends to a limit $\alpha_r$ as $n \to \infty$. This hypothesis is verified in the
case $m = 1$, since we have shown that $\alpha_0(n) = \beta$ for all $n$. From the hypothesis
it follows that $P_n$ tends to a limit as $n \to \infty$. We now deduce from
(9), by putting $c_n = P_n/s^m$, $k = a_{01}/s^m$, $u_n = \alpha_m(n)$ in Lemma 1, that $\alpha_m(n)$
tends to a limit as $n \to \infty$.

We next show that there exist numbers $\lambda$ and $\rho$ such that $|\alpha_r(n)| < \lambda\rho^{-r}$
for all $n$. Let $M$ be the maximum value of $|f(z, w) - \beta|$ when $z$, $w$ are
within circles $|z| \le R$, $|w - \beta| \le T$. Then, by the extension of Cauchy’s
inequalities to analytic functions of two variables,

$$|a_{rt}| \le M R^{-r} T^{-t}. \tag{10}$$

Also, if $M_T$ denotes the maximum of $|f(0, w) - \beta|$ on $|w - \beta| = T$, then

$$|a_{01}| = \lim_{T \to 0} M_T/T.$$

Hence, given $\epsilon > 0$, there exists $\eta$ such that $|M_T/T - |a_{01}|| < \epsilon/2$ if $T < \eta$.
Having chosen a value of $T$ less than $\eta$, we can find $R$ such that
$|M/T - M_T/T| < \epsilon/2$. Then

$$|M/T - |a_{01}|| \le |M/T - M_T/T| + |M_T/T - |a_{01}|| < \epsilon.$$





Since $\sigma > |a_{01}|$, it is therefore possible to choose $R$ and $T$ so that

$$M/T < \sigma. \tag{11}$$

Let

$$\lambda = T(\sigma - 1), \tag{12}$$

and choose $\rho$ so that

$$\rho < R\sigma, \tag{13}$$

$$\frac{M\rho}{\lambda R\sigma} + \frac{MR}{T(R\sigma - \rho)} < 1 \tag{14}$$

(this being possible by (11) since the left-hand side of (14) tends to $M/T\sigma$
as $\rho \to 0$), and, for all $r > 0$,

$$\lambda\rho^{-r} > |\alpha_r(1)|. \tag{15}$$

Since $\sum \alpha_r(1) z^r$ is a power series with non-zero radius of convergence, this
last condition can always be satisfied by suitable choice of $\rho$. Moreover,
if we take for $\phi_1(z)$ the simplest permissible function, namely the constant
$\beta$, it can be omitted altogether.

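We now make the induction hypothesis that

$$|\alpha_r(p)| < \lambda\rho^{-r}$$

for all $r > 0$ and $p \le n$. The hypothesis is satisfied in the case $n = 1$ in
virtue of (15). By Lemma 2 the coefficient of $z^{m-r}$ in

$$\sum_{t=0}^{\infty} a_{rt} \left(\sum_{p=1}^{\infty} \alpha_p(n) z^p\right)^t$$

has modulus less than

$$\frac{M\lambda}{R^r (T + \lambda)}\left(\frac{T + \lambda}{T\rho}\right)^{m-r}$$

provided that $m > r$; and, using (12), this expression may be written in
the form

$$\frac{M\lambda}{R^r T\sigma}\left(\frac{\sigma}{\rho}\right)^{m-r}.$$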

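Therefore, picking out the coefficient of $z^m$ in (8), we obtain

$$\begin{aligned}
|\alpha_m(n+1)| &< \frac{M}{R^m \sigma^m} + \frac{MR\lambda}{T(R\sigma - \rho)\rho^m}\left[1 - \left(\frac{\rho}{R\sigma}\right)^m\right] \\
&< \frac{\lambda}{\rho^m}\left[\frac{M}{\lambda}\left(\frac{\rho}{R\sigma}\right)^m + \frac{MR}{T(R\sigma - \rho)}\right] \\
&\le \frac{\lambda}{\rho^m}\left[\frac{M\rho}{\lambda R\sigma} + \frac{MR}{T(R\sigma - \rho)}\right] \\
&< \lambda/\rho^m,
\end{aligned}$$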




where in the last four lines we have used (10), (13) twice, and (14). Hence
by induction $|\alpha_r(n)| < \lambda\rho^{-r}$ for all $n$ and $r > 0$. Therefore

$$|\alpha_r(n) z^r| < |\lambda z^r \rho^{-r}| = M_r, \text{ say},$$

where $M_r$ is independent of $n$ and $\sum M_r$ is convergent if $|z| < \rho$. Hence
on account of uniform convergence with respect to $n$,

$$\lim_{n\to\infty} \sum_r \alpha_r(n) z^r = \sum_r \alpha_r z^r, \quad (|z| < \rho),$$

since each term of the series on the right is the limit, as $n \to \infty$, of the
corresponding term on the left. Consequently $\lim_{n\to\infty} \phi_n(z)$ exists and is a
regular solution of the functional equation (5) in the circle $|z| < \rho$.
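
The convergence of the coefficients can be watched numerically. The following Python sketch is an illustration under assumptions that are not in the paper: it takes the hypothetical function $f(z, w) = z + a w + w^2$ with $\beta = 0$, $s = 2$ and $a = a_{01} = 0.5$, starts from $\phi_1(z) = \beta$, and applies the recurrence (7) to power series truncated at a fixed order.

```python
def multiply_truncated(p, q, order):
    """Product of two power series (coefficient lists), truncated at z**order."""
    out = [0.0] * (order + 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j > order:
                break
            out[i + j] += pi * qj
    return out


def next_coefficients(alpha, s, a, order):
    """One step of phi_{n+1}(s z) = f(z, phi_n(z)) for f(z, w) = z + a*w + w**2."""
    square = multiply_truncated(alpha, alpha, order)
    rhs = [a * alpha[m] + square[m] for m in range(order + 1)]
    rhs[1] += 1.0                       # the term z of f(z, w)
    # the coefficient of z**m on the left is s**m * alpha_m(n+1)
    return [rhs[m] / s ** m for m in range(order + 1)]


def iterate_coefficients(s, a, order, steps):
    alpha = [0.0] * (order + 1)         # phi_1(z) = beta = 0
    for _ in range(steps):
        alpha = next_coefficients(alpha, s, a, order)
    return alpha


def evaluate(alpha, z):
    """Horner evaluation of the truncated series at z."""
    total = 0.0
    for coefficient in reversed(alpha):
        total = total * z + coefficient
    return total


s, a = 2.0, 0.5
alpha = iterate_coefficients(s, a, order=12, steps=60)
print(alpha[1], 1.0 / (s - a))
```

For this choice the limiting coefficients satisfy $\alpha_1 (s - a) = 1$ and $\alpha_2 (s^2 - a) = \alpha_1^2$, which is the pattern that determines the coefficients one after another from the lower ones.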

That this solution is unique can be seen as follows. If $\phi(z)$ satisfies (5),
then $\phi(0) = \beta$ where $f(0, \beta) = \beta$. Therefore, if $\phi(z)$ is regular at the origin,
we may write

$$\phi(z) = \beta + \sum_{r=1}^{\infty} \alpha_r z^r,$$

and, using (6), we must have

$$\beta + \sum_{r=1}^{\infty} \alpha_r s^r z^r = \sum_{r,t=0}^{\infty} a_{rt}\, z^r \left(\sum_{n=1}^{\infty} \alpha_n z^n\right)^t. \tag{16}$$

Equating coefficients of $z^n$ we obtain

$$\alpha_n (s^n - a_{01}) = H(\alpha), \tag{17}$$

where $H(\alpha)$ is a function of $\alpha_1, \alpha_2, \ldots, \alpha_{n-1}$. Since we are assuming that
$\sigma > |a_{01}|$, we cannot have $s^n - a_{01} = 0$ for any value of $n$. Therefore the
coefficients $\alpha_n$ are uniquely determined in succession by equations of the
type (17). Accordingly there cannot be more than one solution $\phi(z)$ which
is regular at the origin and takes the value $\beta$ there.

It is desirable to know how far the condition $\sigma > |a_{01}|$ is necessary for
the truth of the results established in this section. If we remove this
condition it is no longer true that the coefficient of $z^m$ in $\phi_n(z)$ must tend to
a limit as $n \to \infty$. This is seen by taking a particular example: if we try
to solve the equation

by taking $\phi_1(z)$ to be the constant we find that the coefficient of $z$ in
$\phi_n(z)$ tends to $-\infty$ as $n \to \infty$.

It may nevertheless be true that the limiting process converges if the
function $\phi_1(z)$ is chosen with more care in such cases. It is not indeed
difficult to show that $\phi_1(z)$ can always be chosen so that each of the
coefficients in $\phi_n(z)$ tends to a limit; but it is an open question whether
the power series determined by the limiting values of these coefficients is
convergent.


4. The Equation $f[z, \phi\{g(z)\}] = \phi(z)$

We shall prove the following theorem:

Let $\zeta$ be a double point of the function $g(z)$ such that $g(z)$ is regular at
$\zeta$ and $0 < |g'(\zeta)| < 1$; and let $\beta$ be a solution of the equation

$$f(\zeta, w) = w$$

such that (i) $f(z, w)$ is regular in a domain of the place $(\zeta, \beta)$, and

$$\text{(ii)} \quad \left|\frac{\partial f}{\partial w}\right| < \frac{1}{|g'(\zeta)|}$$

when $z = \zeta$, $w = \beta$. Then the functional equation

$$f[z, \phi\{g(z)\}] = \phi(z) \tag{1}$$

has just one solution $\phi(z)$ which is regular at $\zeta$ and takes the value $\beta$ there.
This solution can be given as $\lim_{n\to\infty} \phi_n(z)$, where the functions $\phi_n(z)$ are
defined by the recurrence relation

$$\phi_{n+1}(z) = f[z, \phi_n\{g(z)\}] \tag{18}$$

and $\phi_1(z)$ is an arbitrary function regular at $\zeta$ and taking the value $\beta$ there.

We shall write $g'(\zeta) = 1/s$, where evidently $|s| > 1$. By the work of
Koenigs referred to in § 1, there exists a function $h(z)$, which is regular at
$\zeta$ with $h'(\zeta) \ne 0$, and which satisfies the equation

$$h\{g(z)\} = h(z)/s \tag{19}$$

in a neighbourhood of $\zeta$. For such a function we clearly have $h(\zeta) = 0$;
and, since $h'(\zeta) \ne 0$, the inverse function of $h(z)$ possesses a branch $h_{-1}(z)$
which is regular at the origin and takes the value $\zeta$ there.
The transformation

$$z = h_{-1}(sZ) \tag{20}$$

gives a one-one mapping of a neighbourhood of $\zeta$ in the $z$-plane on a
neighbourhood of the origin in the $Z$-plane. From (19),

$$h\{g(z)\} = Z, \qquad g(z) = h_{-1}(Z). \tag{21}$$




Consider a pair of functions $\phi(z)$, $\Phi(Z)$ defined in the neighbourhood of
$\zeta$ and 0 respectively, and so related that

$$\Phi(Z) = \phi\{g(z)\}. \tag{22}$$

Then $\Phi(Z) = \phi\{h_{-1}(Z)\}$, and therefore, using (20),

$$\Phi(sZ) = \phi(z). \tag{23}$$

Conversely, if the functions satisfy (23), then $\Phi(sZ) = \phi\{h_{-1}(sZ)\}$, whence we
deduce (22). Thus the functional equations (1) and

$$f[h_{-1}(sZ), \Phi(Z)] = \Phi(sZ), \tag{24}$$

considered in the neighbourhood of $\zeta$ and the origin respectively, are
equivalent: the existence of a solution to one implies the existence of a
solution to the other, the solutions being related by (22) and (23).

Now let the functions $\Phi_n(Z)$ be defined by

$$\Phi_n(sZ) = \phi_n(z),$$

so that by (22) we have also $\Phi_n(Z) = \phi_n\{g(z)\}$. Then, from (18),

$$\Phi_{n+1}(sZ) = f[h_{-1}(sZ), \Phi_n(Z)].$$

But the function $f[h_{-1}(sZ), w]$ is a regular function of the two variables
$(Z, w)$ in a domain of the place $(0, \beta)$, and it takes the value $\beta$ at this place.
Moreover, when this function is expanded in powers of $Z$ and $(w - \beta)$, the
coefficient of $(w - \beta)$ is

$$\left[\frac{\partial f(z, w)}{\partial w}\right]_{z=\zeta,\, w=\beta},$$

which we know to have modulus less than $|s|$. By the work of § 3 it
follows that, in a neighbourhood of the origin, $\Phi_n(Z)$ tends to a limit $\Phi(Z)$;
and that $\Phi(Z)$ is the unique regular solution of (24) which takes the value
$\beta$ at $Z = 0$. Since $\phi_n(z) = \Phi_n(sZ)$ for all $n$, therefore $\phi_n(z)$ tends to a limit
$\phi(z)$ which satisfies $\phi(z) = \Phi(sZ)$. Consequently, owing to the equivalence
of (1) and (24) which we have already remarked, $\phi(z)$ is a solution of (1)
which is regular at $\zeta$ and takes the value $\beta$ there: and it is the only such
solution. This completes the proof of the theorem.
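
The recurrence (18) lends itself to direct numerical evaluation. The sketch below is an illustration only, under assumptions that do not come from the paper: the functions $f(z, w) = z + 1.5w$, $g(z) = z/2$ and $g(z) = z/2 + z^2/4$ are hypothetical examples with $\zeta = 0$, $\beta = 0$ and $|\partial f/\partial w| = 1.5 < 1/|g'(\zeta)| = 2$, and $\phi_1(z)$ is taken to be the constant $\beta$.

```python
def phi_n(f, g, beta, z, n):
    """Value at z of phi_n, where phi_1 is the constant beta and
    phi_{k+1}(z) = f(z, phi_k(g(z))) as in the recurrence (18)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    if n == 1:
        return beta
    # points z, g(z), g(g(z)), ... at which f has to be applied
    orbit = [z]
    for _ in range(n - 2):
        orbit.append(g(orbit[-1]))
    w = beta
    for point in reversed(orbit):       # innermost application first
        w = f(point, w)
    return w


def f_example(z, w):
    return z + 1.5 * w


def g_linear(z):
    return 0.5 * z


def g_quadratic(z):
    return 0.5 * z + 0.25 * z * z


# With g(z) = z/2 the equation z + 1.5*phi(z/2) = phi(z) has the regular
# solution phi(z) = 4z, so the values below should approach 4.
for n in (2, 5, 20, 200):
    print(n, phi_n(f_example, g_linear, 0.0, 1.0, n))
```

It may be noticed that the iteration converges here although $|\partial f/\partial w| > 1$; what the theorem requires is only that this modulus be less than $1/|g'(\zeta)|$.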

5. Region of Validity of the Solution 

The solution which we have obtained for equation (1) is valid in a
domain containing the double point $\zeta$ of $g(z)$. It is difficult to give a
precise specification of the extent of this domain in the general case. We
shall assume here that $f(z, w)$ is a regular function of the two variables for
all finite values of $w$, when $z$ lies in the attractive domain of $\zeta$. This
attractive domain is defined as follows. Let $g_n(z)$ denote the $n$th iterate of
$g(z)$. A double point $\zeta$ of $g(z)$ for which $|g'(\zeta)| < 1$ is contained in a
certain domain $\Gamma$, such that $\lim_{n\to\infty} g_n(z_0) = \zeta$ whenever $z_0$ is a point of $\Gamma$
(Montel, 1927, Chap. 8). This domain is called the attractive domain of $\zeta$,
and $\zeta$ itself is called an attractive double point.

We shall show that the sequence of functions $\phi_n(z)$ converges to a limit
function which is regular, when $z$ lies in $\Gamma$. In fact, we know that there is a
domain $D_1$ such that $\phi_n(z)$ tends to a regular limit when $z$ is in $D_1$. We
define a set of domains $D_m$ by saying that $D_{m+1}$ is the domain which is
mapped on $D_m$ by the transformation $z \to g(z)$. $\Gamma$ is then contained in the
union of all the $D$'s. We make the induction hypothesis that $\phi_n(z)$ tends
to a regular limit when $z$ lies in $D_m$. Now let $z$ be a point in $D_{m+1}$. We
have

$$\phi_{n+1}(z) = f[z, \phi_n\{g(z)\}],$$

and the right-hand side tends to $f[z, \phi\{g(z)\}]$ as $n \to \infty$. But this is a regular
function of $z$. Therefore, by induction, $\phi_n(z)$ tends to a regular limit
function when $z$ is in any of the domains $D_m$; and does so, consequently,
when $z$ is in the attractive domain of $\zeta$.

6. The Repulsive Double Point

A repulsive double point of $g(z)$ is a double point $\zeta$ such that $|g'(\zeta)| > 1$.
If (i) $\zeta$ is a repulsive double point of $g(z)$, (ii) $\beta$ is such that $f(\zeta, \beta) = \beta$,
(iii) $f(z, w)$ is a regular function of the two variables $(z, w)$ in the vicinity of
$(\zeta, \beta)$, (iv) $\left(\dfrac{\partial f}{\partial w}\right)_{\zeta, \beta} \ne 0$, then, by the implicit function theorem for functions
of two variables, the equation

$$f[z, \phi\{g(z)\}] = \phi(z)$$

may be expressed in the form

$$\phi\{g(z)\} = F[z, \phi(z)],$$

where $F(z, w)$ is an analytic function of the two variables $(z, w)$ in the
vicinity of $(\zeta, \beta)$, and $\beta$ is such that $\beta = F(\zeta, \beta)$. But, since $|g'(\zeta)| > 1$,
there exists a branch $g_{-1}(z)$ of the inverse function which is regular at $\zeta$
and such that $|g'_{-1}(\zeta)| < 1$. Therefore the functional equation can be
still further transformed to

$$\phi(z) = F[g_{-1}(z), \phi\{g_{-1}(z)\}],$$

which is of the form (1), with $g_{-1}(z)$ in place of $g(z)$ and $F[g_{-1}(z), w]$ in place of $f(z, w)$.
This is of the type previously treated, provided that

$$\left|\frac{\partial F}{\partial w}\right| < |g'(\zeta)| \text{ at } (\zeta, \beta);$$

a condition which corresponds to

$$\left|\frac{\partial f}{\partial w}\right| > 1/|g'(\zeta)|.$$

REFERENCES TO LITERATURE 

Fatou, P., 1920. “Sur les Équations Fonctionnelles”, Bull. Soc. Math. France,
xlviii, 261.

Koenigs, G., 1884. “Recherches sur les Équations Fonctionnelles”, Ann. Sci. Éc.
Norm. Paris, ser. 3, i, supplement, 19.

Montel, P., 1927. Leçons sur les Familles Normales de Fonctions Analytiques,
Paris.


(Issued separately August 23, 1952)





XXV.— The First Chemical Society, the First Chemical Journal, 
and the Chemical Revolution.* By James Kendall, M.A., 
D.Sc., LL.D., F.R.S., P.R.S.E. (With Two Plates.) 

(Address of the President at a Meeting held on July 7, 1952) 

(MS. received June 5, 1953) 

1. The First Chemical Society 

In very olden days chemists did not forgather merely as chemists; they 
merged themselves in broader organizations such as the Royal Society. 
The “chemical revolution”, which had its real beginning with the work 
of Joseph Black and which culminated in the overthrow of the phlogiston 
theory by Lavoisier, aroused for the first time a popular interest in the 
special science of Chemistry. Until recently, world priority among the 
chemical societies that resulted therefrom was by general agreement 
conceded to the Chemical Society of Philadelphia, founded by James 
Woodhouse in 1792. The distinguished chemical historian Edgar 
F. Smith (1), late Provost of the University of Pennsylvania, may be 
quoted in this connection: 

This was the first chemical society in the world. As far as can be learned, 
Woodhouse was its first and only president. The society lived about seventeen
years. Its members favoured Lavoisier’s doctrine of combustion. The minutes 
of the society have never been found, although diligent search has been made 
for them. 

Copies of a number of publications of the society, however, have been 
preserved. One of them, Robert Hare’s memoir on his invention of the 
oxyhydrogen blowpipe, represents a real landmark in scientific discovery. 

The title of the Chemical Society of Philadelphia was not questioned 
until 1935. In April of that year, the Edinburgh University Chemical 
Society met to celebrate its Diamond Jubilee, being under the impression 
that it was founded only sixty years previously, and I was requested to 
prepare a speech for the occasion. Reading Sir William Ramsay's 
biography of Joseph Black to obtain some historical material for this 
oration, I came across the statement (2) that, among the correspondence of
Joseph Black, Ramsay had discovered a sheet of paper, of which only the 

* This address was assisted in publication by a grant from the Carnegie Trust for 
the Universities of Scotland. 




date, 1785, was in Black’s handwriting, entitled "List of the Members of 
the Chemical Society”. Ramsay remarked regarding this as follows: 

This may have been a society of persons residing in Edinburgh interested in 
Chemistry, but is more likely to have been a general society. . . . The only 
name that I can recognize is that of Dr Thos. Beddoes [the founder of the 
Pneumatic Institute and the “discoverer” of Sir Humphry Davy]; the names 
themselves would indicate that their possessors belonged to all parts of the 
kingdom. 


The complete list of names (fifty-nine in all) is reproduced below.

John Webster 

Hen. N. Ward 

J. Alderson 

Wm. Scott 

Morgan Deasy 

J. McElwaine 

-Halliday 

H. Pache 

T. Gill 

Jas. Forster 

Jno. Gay 

T. Willson 

Sam. Black 

Tho. McMorran 

Frs. Montgomery 

Jno. Black 

Adam Gillespy 

Thos. Swainson

Jas. Plumbe 

Thos. Burnside 

Archd. Webb 

Wm. Johnston 

Com. Pyne 

J. Crumbic 

Thos. Clothier 

J. Unthank 

Geo. Marjoribanks 

Henry Johnston 

J. Barrow 

T. Grieg 

Peter Gemon 

J. Donavan

J. Pan- 

Robt. Ross 

Sami. Macoy 

Alex. Stevens 

L. van Meurs 

G. Tower 

Wm. Symonds 

(amteemann batavis) 

J. Sedgwick 

Thos. Beddoes 

Hugh Brown 

T. Skeete 

J. Thompson 

John Boyton 

Wm. Robertson 

G. Kirkaldie 

Edw. Fairclough 

J. Sprolc 

J. Carmichcl 

Bicker McDonald 

Thos. Cooke 

Nich. Elcock 

E. Galley 

Thos. Edgar 

Richd. Gray 

An. Mann 

Guyton Jolly 

J. Hayle 


The idea struck me that an examination of the register of students at 
the University of Edinburgh at that period might identify more of the 
names in the above list. This register was accordingly consulted, and 
within a quarter of an hour it was established that no fewer than fifty-three 
out of the fifty-nine were students attending Black’s class in Chemistry at 
Edinburgh University during the years 1783-87. Thanks to the kindly 
zeal of Dr Alexander Morgan, five of the missing six were also subsequently 
traced as registered students of Joseph Black between 1780 and 1788. 
The only name on the list definitely unlocated is that of Peter Gemon. 

In the light of the above discoveries, it became my pleasant duty to 
inform my fellow-members of the Edinburgh University Chemical Society 
at their "Diamond Jubilee Luncheon” that they had been called together 
under false pretences, since they were actually celebrating their Sesquicentenary!
For it is quite plain that the society of 1785 was not, as
Ramsay surmised, a general society. It was, on the contrary, a society 
consisting of those members of Black’s class with a special interest in 





chemistry—in other words, it was the Chemical Society of the University 
of Edinburgh. How long it survived after 1785 we have still no knowledge,
but once the origin of its membership had been fixed, the possibility,
noted by Ramsay, of locating "some one of their descendants in possession 
of some record of its proceedings and history" clearly became much more 
likely of realization. 

Nineteen out of the fifty-nine members are on the list of medical 
graduates of the University of Edinburgh between the years 1784-90. 
This list also gives the nationality in each case, and it is interesting to 
find that, of the nineteen, only three were native Scots, three English, and 
the residual thirteen all Irish! Joseph Black was himself, of course, of 
Belfast ancestry, and the proportion of Irish students in the Scottish 
Universities has always been significant; but such a preponderance as 
this is very surprising. Whether its Irish element was responsible for the 
disruption of the society (it is ominous that Bicker McDonald and 
J. Unthank were both Hibernians) is a question that, perhaps just as well, 
we are not in a position to press. 

The first chemical society in the world to complete a hundred years of 
continuous existence was the Chemical Society of London, the official 
records of which state that "on the 23rd of February, 1841, twenty-five 
gentlemen interested in the prosecution of Chemistry met together at the 
Society of Arts to consider whether it be expedient to form a Chemical 
Society". Even after the publication (3) of my account of the origin of 
the Chemical Society of the University of Edinburgh, the Chemical 
Society of London could still, quite justifiably, object that no evidence 
had been adduced that the Chemical Society of the University of Edinburgh 
ever really functioned as a society. Nothing definite was known of its 
activities; there were no publications, no records of meetings, merely a 
list of names. How this deficiency was remedied will now be described. 

2. The First Chemical Journal 

In March 1947 I received a letter from the Rev. P. J. McLaughlin, 
D.Sc., of St Patrick’s College, Maynooth, to the effect that he had discovered
in the archives of the Royal Irish Academy, Dublin, a folio
volume containing a collection of " Dissertations read before the Chemical 
Society instituted in the beginning of the Year 1785". This volume, 
according to its title-page (PI. I), had been presented to the Royal Irish 
Academy a century previously, in January 1846. No specific mention of 
its source was apparent anywhere in the volume, and it remained a puzzle 
to Dr McLaughlin until he was referred by chance by Dr Farrington,




Secretary of the Academy, to an article which I had written in Endeavour 
in 1942 on “Some Eighteenth-Century Chemical Societies” (4). A partial
list of the contributors, which Dr McLaughlin sent me, consisted of names 
that all appear on Joseph Black’s sheet. 

I wrote at once to the Council of the Royal Irish Academy requesting 
them to be kind enough to loan the volume to the Royal Society of 
Edinburgh so that I might have the opportunity of inspecting it carefully. 
This favour was promptly accorded, and examination of its contents soon 
convinced me that I indeed held in my hands the first volume of the 
Proceedings of the Chemical Society of the University of Edinburgh. 
Presumably the unknown secretary of the society, to whose admirable 
diligence this record owes its being, was one of the formidable fraction 
of Irish members, and retained possession of his handiwork when he 
returned to his native country. 

The book is a well-bound folio of 452 pages in copperplate manuscript, 
and contains thirty-two dissertations on topics of chemical interest. Some 
of the title-pages to the communications are true works of art (PI. II). 
All of the contributors are on Joseph Black’s list, with the exception of 
Nos. 22 and 27, Mr William Lecky and Mr S. Latham Mitchill, and it must 
be assumed that these gentlemen joined the society after its inception. At 
the conclusion of Paper 15 the words are added: Edin^r, 26^th, Nov^r, 1785;
this is the only direct reference to Edinburgh in the volume. The 
beautifully clear handwriting throughout does not indicate, as I fondly 
supposed on first inspection, that students of that period were all experts 
in penmanship; its identity in numerous papers bears witness to the 
employment of professional scriveners. These occasionally could not 
decipher chemical terms contained in the manuscript and left blanks for 
their subsequent insertion; where such insertions have been made they 
are still apt to be almost illegible. 

Unfortunately, the thirty-two communications include very little 
matter of scientific importance, and my expectation that the volume might 
afford a valuable, as well as a unique, record of contemporary chemical 
thought has not been felicitously realized.* I had hoped to find the 
great Joseph Black himself participating in discussions upon the onslaught 
that Lavoisier was then launching against the phlogiston theory of Stahl, 
but Black suffered much from ill-health during his latter years and there 
is no note of his attendance at any of the meetings of the society. References
to “the learned Doctor” and to his work, however, are very frequent.
Most of the papers are routine descriptive chemistry of the classical
period, and one of the most promising of the few exceptions, "An Attempt
to point out some of the Consequences which flow from Mr Cavendishes
Discovery of the Component Parts of Water", by Mr Thomas Beddoes,
peters out, after a tantalizing start, into an addendum entitled "A Conjecture
concerning the Use of Manure"! Evidently the youthful Humphry
Davy was not unduly disrespectful when he described Dr Beddoes, under
whose direction he was working at the Pneumatic Institution at Clifton
in 1799-1800, to be "as little fitted for a Mentor as a weather-cock for a
compass".

* Perhaps I had pitched my standards too high, for I have since made a point of
examining Volume I of the Memoirs of the Chemical Society of London, published in
1843, and although this contains communications by chemists of such eminence as
Bunsen, Liebig, Graham and Playfair, there is not very much else in it that is really
stimulating.

This disappointment, however, emboldened me to make a second 
request to the Council of the Royal Irish Academy—a request which I 
should not have felt justified in making if the volume had proved to 
be of greater intrinsic value—namely that the Council should agree to 
return the volume to its original owners, the Chemical Society of the 
University of Edinburgh. To this request the Council of the Royal 
Irish Academy graciously assented, and I was accordingly able, at a 
meeting of the Society held on November 25, 1947, to make formal 
restoration to the first chemical society in the world of the first volume 
of its Proceedings, now lying on the lecture-table before me. Even if it 
has no great scientific significance, it does after all constitute the first 
journal extant of a purely chemical character, antedating by five years 
the Annales de Chimie initiated in Paris in 1790, and extracts from it may
well prove worthy of wider publication. 

Careful examination, indeed, has disclosed that one paper does contain 
matter of significant historical interest, as will be developed in the next 
section of my address. Meanwhile let me digress for a minute by reciting 
for your edification the introductory sentence of a topical communication 
by Mr Richard Webb "On Coal": 

Among the innumerable Gifts which Nature has so liberally bestowed on 
Man, few seem to be more useful or better adapted to the conveniences of Life 
than the Lithanthrax or Pit Coal; and tho’ not an exclusive, yet may be stiled a 
peculiar blessing to Britain, from their plenty, their excellence, and being found 
conveniently situated for exportation. 

3. The Views of Mr John Carmichaell on Phlogiston 

The penultimate paper of the first volume of the Proceedings of the 
Chemical Society of the University of Edinburgh, presumably * read in 

* Since Papers 1 to 15 occupy from the beginning of the year 1785 to November 26,
1785 (see p. 349).



the latter part of the year 1786, is entitled, "Some Account of the Theories
of Combustion, of Heat, of Light, and of Colour", by Mr John Carmichaell. 
That this youthful philosopher was well aware that he had undertaken a 
heavy responsibility in discussing all these topics at once is seen by a 
Latin subscript to the title, "Leni me corrigite manu", which may be
freely translated, "Don't spank me too severely for my mistakes!" We
shall concern ourselves here solely with his views on the burning topic of 
phlogiston. How burning this question then was may be seen by the 
fact that it consumed five consecutive sessions of the Royal Society of 
Edinburgh in 1788, Sir James Hall appearing for the prosecution and 
Hutton for the defence. For the benefit of non-chemists, the essential 
point at issue may be very briefly (and, of course, very imperfectly) 
explained. The phlogistonists insisted that when a metal was heated in 
air and converted into an earth, or calx, phlogiston was expelled from it; 
the awkward fact that the calx weighed more than the original metal was 
explained by assigning to phlogiston a negative weight. The “pneumatic 
doctrine” of Lavoisier, on the other hand, maintained that the calx was 
a compound of the metal with the oxygen of the air. Mr Carmichaell 
plunges right into the fight in his first paragraph: 

Mr President: Stahl was the first who reduced Chemistry to a regular Science, 
and enriched it by addition, or at best the Extention, of Becher’s doctrine of 
the Inflammable Earth, Phlogiston, or the Principle of Inflammability. Altho' 
perhaps we owe the advanced State in which we find the Science at this day to 
the introduction of this Principal, as it has all along afforded such easy and clear
explanation of all Chemical Phenomena, yet it is frequently the cause of the
utmost confusion and Ambiguity, as in speaking of it or endeavouring to describe 
it there are hardly two Persons who seem to possess any distinct or even the same 
ideas concerning it, or in describing it use the same language. It is covered with 
a dark Veil and protected by inexplicable Jargon. 

After citing a number of examples of such jargon, Mr Carmichaell 
proceeds: 

However hypothetical this doctrine may appear, it has for long met with 
Courtesy, and Men of Science have unanimously countenanced it, and perhaps it 
might long have remained to be so, luted in the most perfect security, had not the 
Experimenting Genius of a Ray, Hales, Priestley, Bayen, Lavoisier or some other
called in question on what Men of more confined or less exalted ideas of nature 
would have reckoned sacriligeous to have attempted. 

Before bidding adieu to the Manes of Phlogiston and wishing its Eternal 
Oblivion, the ingenuity of a Scheele and a Crawford demand our serious attention. 

The body of Mr Carmichaell’s paper consists, in fact, of a discussion 
of the experiments and theories of "the ingenious Mr Scheele”, "the 
studious Macquer”, "the learned Lavoisier” and "the ingenious 



Dr Lubbock”. Having then explored the nature of heat, light and
colour rather less diffusedly, he concludes as follows: 

And asking forgiveness of the Society for the many inaccuracies, perhaps 
neither few nor small, that I may have been led into, I conclude with the following 
line from Virgil which I think is applicable to this fanciful Proteus Phlogiston: 

Venit summa dies— 

Fuimus Troes, fuit Ilium et ingens gloria Teucrorum. 

Now what is there remarkable in the fact that this young medical
student should have announced so confidently in 1786 that the last hour 
of phlogiston had arrived? To answer this question I shall need to 
discuss the contemporary views of more famous chemists, particularly 
Joseph Black. 


4. The Views of Joseph Black on Phlogiston 

The opinion is still common among chemists that Joseph Black was 
among the last to abandon the theory of phlogiston and to accept the 
ideas of Lavoisier. P. Rousseau (5) has recently painted this picture: 

Stahl’s phlogiston loses its supporters one by one. . . . Chaptal has already 
banished phlogiston from his course of lectures at Montpellier; Berthollet follows 
suit in 1785. In 1786 it is the turn of Guyton de Morveau ... the following 
year it is Fourcroy who burns what he has idolized . . . and claims the glory of
being the first to teach officially Lavoisier’s theory. Abroad, there is a much 
more lively resistance, especially in England and Germany. Slowly, however, 
the learned men renounce their errors; Jacquin in Holland, the famous Black.. . . 
There are only a few irreconcilables . . . 

This misrepresentation originates essentially from John Robison, 
Professor of Natural Philosophy in the University of Edinburgh and the 
first Secretary of the Royal Society of Edinburgh, who edited the posthumous
edition of Black’s Lectures on the Elements of Chemistry. In
the course of his editorial observations thereto, Robison (6) remarks: 

Mr Lavoisier saw that his theory of combustion depended on the doctrine of 
latent heat, and was extremely anxious to obtain Dr Black’s acquiescence. . . . 
Learning that Dr Black thought well of his theory, and had introduced it into 
his lectures, he wrote to him on July 14, 1790, as follows:

“J’apprends avec une joie inexprimable, que vous voulez bien attacher
quelque merite aux idées que j'ai professé le premier contre la doctrine
phlogistique. Plus confiant dans vos idées que dans les miennes propres,
accoutumé à vous regarder comme mon maitre, j’étais en defiance contre moi
même (credat Judæus Apella *) tant que je me suis ecarté, sans votre aveu,
de la route que vous avez si glorieusement suivie. Votre approbation,
Monsieur, dissipe mes inquietudes, et me donne un nouveau courage. Je ne
serai content jusqu'à ce que les circonstances me permettent de vous aller
porter moi même le temoignage de mon admiration, et de me ranger au
nombre de vos disciples. La revolution qui s’opère en France devait
naturellement rendre inutile une partie de ceux attachés à l’ancien
administration, il est possible que je jouisse du plaisir de la liberté et le premier usage
qui j’en ferai sera de voyager, et surtout en Angleterre, et à Edinburgh, pour
vous y voir, pour vous entendre, et profiter de vos leçons et de vos conseils.”

* This sarcastic insertion is undoubtedly by Robison.

Dr Black wrote him a very plain, candid and unadorned letter in answer, 
expressing his acquiescence in his system. Mr Lavoisier answers this by praising 
in the highest terms the elegance of the style, the profoundness of the philosophy, 
etc. etc. and begs leave to insert the letter in the Annales de Chymie. Dr Black,
who had been in very low spirits when he wrote that letter, and was much
dissatisfied with its feebleness, was disgusted with what he now conceived to be artful
flattery and refused to grant the request. Yet his letter appeared in that work 
before his refusal could reach Paris. 

This wheedling, in order to screw out of Dr Black an acquiescence, on which 
he put a high value for the influence which it would have on the minds of others, 
was surely unworthy of Lavoisier. Dr Black was not only disgusted with the
flattery, but seriously offended with its insincerity; and with a sort of insult on his 
common sense, by the supposition that he could be so wheedled, by a man whose 
publications never expressed the smallest deference for his opinions. For, by 
this time, Dr Black had read Mr Lavoisier’s Elements of Chemistry, and the
various dissertations by him and Mr de la Place, published in the Memoirs of the
Academy. His name is not once mentioned, even in the dissertations on the
measures of heat, where his doctrine of latent heat is delivered and employed as 
the result of Mr Lavoisier's own meditations. Nor is he named in those passages 
of the earlier dissertations, where the characters and properties of fixed air, and 
of the mild and caustic alkalis, are treated of. All appears to be the train of 
Mr Lavoisier’s own thoughts, for which he was indebted to no man. Such 
inconsistency with the deference expressed in the above cited letters, provoked 
Dr Black to such a degree, that he resumed his critique on the nomenclature, 
and began to express his dissatisfaction with some parts of the theory, and his utter 
disapprobation of the unscientific and bullying manner in which the French chemists 
were trying to force their system on the world. But, by this time, his health had 
become so delicate, that the least intensity of study not only fatigued him, but 
made him seriously ill and forced him to give it up. I saw him but seldom at 
this time, being then in very bad health myself, but had this information from 
Dr Hutton, who shared all his thoughts. 

Is there, in point of fact, an atom of truth in all the information that 
Robison gathered from Hutton (not an impartial witness; see p. 351), and 
in the imputations based thereon? That question can now be definitely 
answered in the negative, owing to a discovery made in 1949 by 
Dr McKie (7), and this section of my address can appropriately conclude 
with a condensed account of that discovery, since it leads me back once 
more to Mr Carmichaell.

In September 1793 two representatives of the Paris revolutionary 
committee called upon Lavoisier to search and seal his papers. Some 
letters written in English were taken away and sent to the Comité de
sûreté générale; they are now in the Archives de France, and among them
are two from Joseph Black. The first, dated 24 October 1790, is the
“very plain, candid, and unadorned letter” alluded to by Robison. The
following extract is relevant here: 

You have been informed that I endeavour in my Courses to make my Pupils 
understand the new principles and explanations of the Science of Chemistry 
which you have so happily invented and that I begin to recommend them as more 
simple & plain and better supported by Facts than the former system, and how 
could I do otherways? Your numerous & well contrived experiments have been 
performed with such uncommon accuracy & attention to every circumstance of 
any importance and with such quantitys of the materials that nothing can be more 
satisfactory than the proofs of the Facts which you have investigated. And the 
System you have founded on these facts is so strictly connected with them and so 
simple & intelligible that it must be approved more & more every day and will 
even be adopted by many of those Chemists who have long been habituated to the 
former System: To gain them all is not to be expected, you know too well the 
power of habit which enslaves the minds of the bulk of mankind and makes them 
believe & reverence the greatest absurditys. I must confess that I felt the power 
of it myself, having been habituated 30 years to believe & teach the doctrine of 
Phlogiston as formerly understood. I felt much aversion to the new system which 
represented as an absurdity what I believed to be sound doctrine this aversion 
however which proceeded from the power of habit alone has gradually subsided, 
being over come by the clearness of your demonstrations & consistency of your 
Plan and tho there are still a few particulars which appear to be difficultys, I am 
satisfied that it is infinitely better supported than the former Doctrine; In this 
respect they cannot be compared, nor is this surprising when we consider that in 
composing yours you had the advantage of a multitude of new Experiments 
made with a degree of ingenuity & accuracy unknown to the Chemists of the 
former age. But tho the power of habit may prevent many of the older Chemists 
from approving of your Ideas, the younger ones will not be influenced by the 
same power; they will universally range themselves on your side of which we 
have experience in this university where the students enjoy the most perfect 
liberty of chuseing their philosophical opinions. They in general embrace your 
system and begin to make use of the new nomenclature in proof of which I send 
you two of their inaugural dissertations in which chemical subjects were chosen; 
these Dissertations are wrote entirely by the students; the Professors have no 
share in them. We read them before they are printed to see that there are no 
gross absurditys in them & give our advice if any are found: we sometimes find 
extravagant compliments to ourselves but have not the modesty or discretion to 
strike them out. 

Black certainly does not seem to have written this letter in very low 
spirits, or to have any reason to be dissatisfied with its feebleness; Robison’s 
subsequent statement that he was so disgusted by the artful flattery of 
Lavoisier’s reply that he refused to permit its publication is refuted by a 
second letter, hitherto unpublished, sent from Edinburgh on December 28, 
1790, which begins thus: 

Mr Gahagan who is just returned to this Place made me happy with your 
letter of the 19th Novr. last & with the account he gave me of the ardor with which
you still pursue your philosophical researches. It gave me pleasure also to find
that you are satisfyed with the avowal I have made of my approbation of your 
System of Chemistry. You have my full consent to publish my letter. This 
consent I consider as a tribute I owe to truth and the eminent Rank you hold as 
a promoter & Patron of the Science of Chemistry: in publishing it you may leave 
out any parts of it which you think superfluous. 

Lavoisier was, accordingly, perfectly justified when he published a 
French translation (by Madame Lavoisier) of Letter I in the Annales de 
Chimie in March 1791. As Dr McKie succinctly remarks: “The
translation is unexceptionable and Black's wishes have been scrupulously
observed. The whole incident does credit to two great chemists.”

Even Robison admits that Black considered the death of Lavoisier (he 
was guillotined by the revolutionaries in May 1794) as “a great loss to 
science”; Black himself would probably have expressed his feelings
much more emphatically. In 1796, when he was compelled at last to 
relinquish his long-famous lectures, he introduced Thomas Hope to his 
chemistry class in the following terms: 

After having, for between thirty and forty years, believed and taught the 
chemical doctrines of Stahl, I have become a convert to the new views of chemical 
action; I subscribe to almost all M. Lavoisier’s doctrines and scruple not to teach 
them. But they will be fully explained to you by my colleague and friend 
Dr Hope, who has had the advantage of having them from the mouth of their 
ingenious author. 

It would be a mistake to interpret these words as indicating that Black 
was a recent convert to the new views. In my opinion, he was merely doing 
his best to give his young successor a flying start. Hope (8) is generally 
regarded as the first chemist to promulgate the doctrines of Lavoisier 
publicly in Great Britain, at the University of Glasgow in the winter of 
1787-88, but I shall now proceed to cite strong presumptive evidence that 
Black actually anticipated him. 

5. Dr John Carmichaell, President of the Chemical
Society of Edinburgh 

By good fortune, Mr John Carmichaell is one of the nineteen original 
members of the Chemical Society instituted in 1785 whose name appears 
on the list of medical graduates of the University of Edinburgh (p. 348). 
This inspired me to secure a copy of his inaugural dissertation from the 
University Library, in the hope that it might contain matter of interest. 
My hope has, indeed, been richly fulfilled. 

The dissertation, entitled De Fermentatione, was submitted in August
1787. The following extract from its introductory paragraphs will suffice
to prove that Mr Carmichaell was still of the same mind with regard to
phlogiston as he was in 1786. I reproduce here the Latin text of the 
original, and append my own translation in a footnote: 

Nostri ævi philosophi chemiam multum excoluere; hodierna experientia
scientiæ multum lucis protulit; de multis processibus nunc fideliter disserere licet;
aëris necessitas nunc bene edocta; phlogiston, vagum principium, omnium fere
consensu, ex chemia foras ejectum, dum umbræ loco, substantia doctis hodie
foveatur.

His causis, doctrinam pneumaticam dictam elegi . . .* 

On the basis of this doctrine, consequently, Dr Carmichaell develops 
his treatise on fermentation, with references to Black (clarissimus) and
Lavoisier (hodie primarius scientissimus). His argument, however, is not
very interesting from a modern point of view, for he is flogging what is
now a very dead horse. Perhaps the most significant line in his thesis
is on the title-page, where he proudly announces himself Societ. Chem.
Edin. Soc. Ext. et Præs. Annuus.† For this makes it clear, firstly, that
the Society was not an ephemeral organization, but survived at least into
the year 1787; and secondly, that the anti-phlogistic opinions expressed
in Mr Carmichaell's paper to the Society (pp. 351-352) received such
general approbation that his fellow-members almost immediately thereafter 
proceeded to elect him their president. It is nearly certain, moreover, in 
view of the final paragraphs of Black's first letter to Lavoisier (p. 354), 
that Mr Carmichaell’s inaugural dissertation of 1787 was one of the two 
exhibits sent to Paris in 1790. 

Although Black states that his students enjoyed “the most perfect 
liberty of chuseing their philosophical opinions”, few teachers will doubt
that the majority moulded their views on the pronouncements of their 
professor, and the first volume of the Proceedings of the Chemical Society 
of the University of Edinburgh renders it practically certain that Black 
himself climbed over the phlogiston fence before 1785-86. Space considerations
preclude the presentation of more than three short extracts.
Mr William Scott, in the very first paper read before the Society, “On
Chemical Attraction”, observes:

Aqua regia dissolves gold. . . . Are we with Mr Scheele to suppose that
here a double Elective Attraction takes place, that the Nitrous Acid attracts the
phlogiston of the Muriatic Acid, and thus renders it capable of acting on the
metal? No, for before his hypothesis be accepted he must prove the existence
of phlogiston.

* The philosophers of our age have advanced chemistry considerably; current
knowledge has brought much light on the science; it is now possible to discuss many
operations exactly; the necessity of the air is now well inculcated; phlogiston, that
mysterious principle, has by almost unanimous consent been cast out of chemistry, while
in place of shadow, substance is to-day favoured by the well-informed.

For these reasons, I have chosen the doctrine called the “pneumatic doctrine”.

† Extraordinary Member and President for the Year of the Chemical Society of
Edinburgh.

Mr Samuel Black, in Paper 3, "On Nitre”, is still honestly 
bewildered: 

Dr Stahl imagined that these phenomena were to be explained from the rapid 
union of the nitrous acid with Phlogiston. Mr Fourcroy on the contrary thinks
that the appearances are to be explained from the rapid combustion of the 
Inflammable matter by means of the great quantity of pure Air disengaged in the 
process. Having stated the diversity of opinion among these illustrious chemists, 
it might well appear presumptuous in me to decide in favour of any of them.

But Mr William Haliday, discussing "The Calcination of Metals” in 
Paper 16, remarks: 

Innumerable Hypotheses have been formed, a detail of which would be not 
only very tedious but also very useless, one of them however I cannot pass over in 
Silence, that of the illustrious Stahl, as it has been the one universally received 
till the late discoveries of Messieurs Bayen, Lavoisier, etc., which have thrown 
new light on this Subject. 

These, and many similar quotations that might be added, make it 
apparent—to my mind at least—that Black, in 1785-86, while endeavouring 
to present both doctrines to his chemistry class as impartially as possible, 
was leading the more perspicacious among them from the old path to the 
new. When leisure allows me, I plan to look through all the Inaugural 
Dissertations of a chemical nature submitted to the Faculty of Medicine 
in the years between 1785 and 1790 to check this assumption and to 
discover how many other converts to Lavoisier's ideas, besides Dr 
Carmichaell, testified in Edinburgh during that period. The search will 
also, I hope, enable me to extend the "active life” of our first chemical 
society still further, perhaps even beyond 1790. Unfortunately, the 
dissertations are printed entirely in Latin, so that my task will not be an 
easy one. 

I wish, in particular, to find out more about Dr John Carmichaell; 
his thesis does disclose a few additional details. First, the word Scotus 
after his name indicates that he was among the native minority of the 
original members of the Chemical Society (p. 348). Second, he was 
also a member of the Royal Medical Society, a student organization 
founded in 1737 (what pioneers Edinburgh students of that century were!)
and still flourishing. Third, he dedicates his dissertation to his kinsman, 
Thomas Hay, Esquire, of the Royal College of Surgeons of Edinburgh. 
So far, my search for facts of his later life has been fruitless, but I am still 



on his trail. Perhaps some day Volume II of the Proceedings of the
Chemical Society of Edinburgh, issued under his presidency, will come
to light and provide me with another of his pungent effusions to report 
upon to you. Meanwhile, for whatever errors I may have fallen into in 
my present commentary, Leni me corrigite manu. 

Note added August 16, 1952. 

Through the kindness of J. G. Kyd, Chairman of the Scots Ancestry Research Society, 
I have been furnished with the following data regarding Dr John Carmichaell:

1. His father, Dr Michael Carmichael “of Hizellhead in the parish of Carmichael”,
was married on 25th April 1756 to Miss Mary Hay, daughter of John Hay, W.S., of the 
North-East parish of Edinburgh. 

2. The Carmichael family appears to have been resident at Eastend, Carmichael (a 
village near Lanark), for centuries. John’s elder brother Maurice became head of the 
family on the death of his uncle in 1789. 

3. John Hay, W.S., a great-grandson of Sir John Hay of Barra, accompanied Prince 
Charles to France and was attainted in 1745. It was to his nephew Thomas Hay, M.D. 
of Edinburgh (born 1751, died 1816), the fifth son of Thomas, Lord Huntington, that
Dr John Carmichael dedicated his inaugural dissertation in 1787. 

4. John himself was born at Hazelhead, Carmichael, on 20th February 1766. No
details of his career after graduation have yet come to light. 


REFERENCES TO LITERATURE 

(1) Smith, 1914. Chemistry in America, D. Appleton & Co., 12.

(2) Ramsay, 1918. The Life and Letters of Joseph Black, M.D., Constable &
Co., 110.

(3) Kendall, 1935. Journ. Chem. Educ., xii, 365.

(4) Kendall, 1943. Endeavour, 1, 106.

(5) Rousseau, 1945. Histoire de la Science, Bronty, Fayard et Cie, 388.

(6) Robison, 1803. Lectures on the Elements of Chemistry, by the late Joseph
Black, II, 218-220.

(7) McKie, 1949. Notes and Records of the Royal Society of London, vii, 1.

(8) Kendall, 1944. Endeavour, iii, 150.


(Issued separately October 20, 1952)






[Plate: James Kendall. Proc. Roy. Soc. Edin., A, Vol. LXIII.]







XXVI.— The Normal Penetration of a Thin Elastic-Plastic Plate 
by a Right Circular Cone.* By J. W. Craggs, Ph.D., 
University College, Dundee. Communicated by J. M. Jackson, 
Ph.D. (With Four Text-figures.) 

(MS. received February 5, 1952. Revised MS. received May 28, 1952. 

Read July 7, 1952) 

Synopsis 

A study is made of the propagation of elastic and plastic deformation in a thin plate, 
initially unstressed, and of infinite extent, when it is penetrated normally by a cone 
moving with uniform velocity. The work is an extension of unpublished researches by 
Sir G. I. Taylor on the corresponding problem for a thin wire, and a summary of his 
results is included. 


Introduction 

The propagation of plastic deformation in a thin wire, initially just taut 
and at rest, when it is struck by a projectile moving at right angles to it, 
has been studied by Professor Sir Geoffrey Taylor in unpublished work. 
In this paper a similar method is used for the discussion of the plastic 
and elastic waves set up in a thin plate, initially flat and free from stress, 
when it is penetrated normally by a conical-headed projectile moving 
with uniform velocity. 

1. Tension Waves in a Thin Wire

The analogy between the behaviour of a thin plate and that of a thin 
wire is sufficiently close to make it worth while to summarise Taylor's 
results for the latter case. 

Consider a thin wire of infinite length, initially taut, unstretched and 
at rest. Suppose a sharp-pointed projectile strikes the wire at $x = 0$ and
continues to move, perpendicularly to the wire, with uniform velocity $W$.
Use $\psi$ for the slope of the wire at a point $x$ from the origin, $t$ for the time,
and $ET$ for the tensile stress, where $E$ is Young's modulus. Let $\tau$ be the
ratio of the cross-sectional areas in the deformed and initial states, and
$\rho$ the density. It follows from similarity considerations that $T$, $\psi$, $\tau$, $U$
and $V$, where $U$, $V$ are the longitudinal and transverse velocity components,
are functions of $\lambda = x/t$ only.

* This paper was assisted in publication by a grant from the Carnegie Trust for the 
Universities of Scotland. 




Geometrical considerations lead to the relation 



which reduces to 

$$\sec\psi\,(\lambda - U)\frac{d\psi}{d\lambda} - \cos\psi\,\frac{dV}{d\lambda} - \sin\psi\,\frac{dU}{d\lambda} = 0. \tag{1}$$



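The equations of motion

$$\rho\tau\sec\psi\,\frac{DU}{Dt} = E\frac{\partial}{\partial x}(T\tau\cos\psi)$$

and

$$\rho\tau\sec\psi\,\frac{DV}{Dt} = -E\frac{\partial}{\partial x}(T\tau\sin\psi)$$

reduce to

$$\rho\tau(\lambda - U)\frac{dU}{d\lambda} = -E\cos\psi\,\frac{d}{d\lambda}(\tau T\cos\psi) \tag{2}$$

and

$$\rho\tau(\lambda - U)\frac{dV}{d\lambda} = E\cos\psi\,\frac{d}{d\lambda}(\tau T\sin\psi). \tag{3}$$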

The transverse velocity, V, may be eliminated between the three equations, 
and there results the equation 

$$\left\{(\lambda - U)^2 - (ET/\rho)\cos^2\psi\right\}\frac{d\psi}{d\lambda} = 0. \tag{4}$$
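The elimination of $V$ that leads to (4) can be sketched as follows. This is a reconstruction rather than the author's own working: it assumes the sign convention in which $\psi$ is positive for $x > 0$, takes the reduced momentum equations to carry a factor $\cos\psi$ on the right, and introduces the shorthand $m = \lambda - U$ and $N = \tau T$ purely for convenience. Primes denote $d/d\lambda$.

```latex
\begin{align*}
\text{geometry:}\quad
  & m\sec\psi\,\psi' = \cos\psi\,V' + \sin\psi\,U', \\
\text{momentum:}\quad
  & \rho\tau m\,U' = -E\cos\psi\,(N\cos\psi)', \qquad
    \rho\tau m\,V' = E\cos\psi\,(N\sin\psi)', \\
\cos\psi\,V' + \sin\psi\,U'
  &= \frac{E\cos\psi}{\rho\tau m}
     \bigl[\cos\psi\,(N\sin\psi)' - \sin\psi\,(N\cos\psi)'\bigr]
   = \frac{E\cos\psi}{\rho\tau m}\,N\psi'
   = \frac{ET\cos\psi}{\rho m}\,\psi', \\
\Longrightarrow\quad
  & \bigl\{m^2 - (ET/\rho)\cos^2\psi\bigr\}\,\psi' = 0 .
\end{align*}
```

The terms in $N'$ cancel in the square bracket, which is why the result involves the tension only through $T$ and not through its derivative.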


Equation (4) implies that $d\psi/d\lambda$ is zero except at the point $\lambda = \lambda_0$,
corresponding to the velocity of transverse waves in a wire of tension
$T = T_0$ (say). For $\lambda > \lambda_0$ and $\lambda < -\lambda_0$ the wire has zero slope. For
$0 < \lambda < \lambda_0$ the slope is $\psi_0$ $(= \tan^{-1} W/\lambda_0)$, and for $-\lambda_0 < \lambda < 0$ the slope is $-\psi_0$.
For a plastic-elastic wire there are three regions. For $\lambda - U > \sqrt{E/\rho}$
there is no disturbance. For $\sqrt{E/\rho} > \lambda - U > \sqrt{TE/\rho}$ the displacement
is purely longitudinal, and for $\lambda - U < \sqrt{TE/\rho}$ the normal velocity
is equal to the projectile velocity.
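The three regions described above can be illustrated with a minimal Python sketch. It assumes that the two bounding speeds are $\sqrt{E/\rho}$ and $\sqrt{TE/\rho}$ and that the slope behind the transverse wave is $\tan^{-1}(W/\lambda_0)$, as in the text. The function names and the steel-like numbers are hypothetical and serve only as an illustration; none of them comes from the paper.

```python
import math

def wave_speeds(E, rho, T):
    """Return (sqrt(E/rho), sqrt(T*E/rho)): the longitudinal and transverse speeds.

    E is Young's modulus, rho the density and T the dimensionless tension
    (tensile stress = E * T), following the notation of the text.
    """
    return math.sqrt(E / rho), math.sqrt(T * E / rho)

def wire_region(lam_minus_u, E, rho, T):
    """Classify a point of the wire by the value of lambda - U."""
    c_long, c_trans = wave_speeds(E, rho, T)
    if lam_minus_u > c_long:
        return "undisturbed"
    if lam_minus_u > c_trans:
        return "longitudinal"
    return "transverse"

def kink_slope(W, lam0):
    """Slope psi_0 = arctan(W / lambda_0) behind the transverse wave, in radians."""
    return math.atan(W / lam0)

# Hypothetical values (SI units), chosen only to exercise the three branches.
E, rho, T = 2.0e11, 7.8e3, 0.002
for value in (6000.0, 1000.0, 100.0):
    print(value, wire_region(value, E, rho, T))
print(math.degrees(kink_slope(100.0, 400.0)))
```

The classification uses strict inequalities, as the text does, so a point lying exactly on a wave front is assigned to the region behind it.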


2. The Equations of Motion for a Thin Plate 

For an isotropic thin plate deformed by a right circular cone moving
along its own axis, which is normal to the plate, the deformation is
symmetrical about that axis, and for an initially stationary unstressed
plate there is dynamic similarity, with a single independent variable
$\lambda = r/t$, where $r$ is the radius. Take radial and normal velocity
components $U$ and $V$; radial and circumferential stress components $E\sigma_r$
and $E\sigma_\theta$; slope of the plate, measured in an axial plane, $\psi$, and
thickness ratio $\tau$ (equal to current thickness : initial thickness). Then the
geometrical relation (1) is still valid; the equations of motion reduce to

$$\cos\psi\,\frac{d}{d\lambda}(\lambda\tau\sigma_r\sin\psi) = \frac{\lambda(\lambda - U)}{c^2}\,\tau\,\frac{dV}{d\lambda} \tag{5}$$

and

$$\cos\psi\,\frac{d}{d\lambda}(\lambda\tau\sigma_r\cos\psi) - \tau\sigma_\theta = -\frac{\lambda(\lambda - U)}{c^2}\,\tau\,\frac{dU}{d\lambda}, \tag{6}$$

where $c = \sqrt{E/\rho}$, and equation (4) is replaced by

$$\left\{\sigma_r\cos^2\psi - \left(\frac{\lambda - U}{c}\right)^2\right\}\lambda\frac{d\psi}{d\lambda} + \sigma_\theta\sin\psi\cos\psi = 0. \tag{7}$$

3. The Elastic Solution 

Consider first the case in which the plate is perfectly elastic and
initially unstressed. Now the velocity of membrane-type waves of small
amplitude is $\sqrt{T/\rho}$, where $T$ is the radial tension in the plate.
(This follows from (7), which shows that discontinuities of the
derivative $d\psi/d\lambda$ occur only for the values of $\sigma_r$ which make the coefficient
in double brackets equal to zero.) The first wave to travel outwards
in the elastic plate must therefore be a tension wave travelling
with velocity $c' = c/\sqrt{1 - \nu^2}$ ($\nu$ being Poisson's ratio), appropriate to plane stress. Within the
circle $\lambda = c'$ there will be an annulus in which the displacement $v$ is zero.
Now in elasticity theory changes in the thickness of the plate can be
neglected; the equation of motion then becomes

$$\frac{d}{d\lambda}(\lambda\sigma_r) - \sigma_\theta = -\frac{\lambda(\lambda - U)}{c^2}\frac{dU}{d\lambda}. \tag{8}$$

[Fig. 1. The solution for a thin wire.]
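For orientation, the wave speeds mentioned in this section can be evaluated numerically. The sketch below assumes the standard plane-stress relation $c' = c/\sqrt{1 - \nu^2}$ with $c = \sqrt{E/\rho}$, and the membrane speed $\sqrt{T/\rho}$ with $T$ a dimensional tension. The function names and material values are hypothetical and are not taken from the paper.

```python
import math

def bar_wave_speed(E, rho):
    """c = sqrt(E / rho)."""
    return math.sqrt(E / rho)

def plane_stress_wave_speed(E, rho, nu):
    """c' = c / sqrt(1 - nu**2), the speed of the leading tension wave."""
    return bar_wave_speed(E, rho) / math.sqrt(1.0 - nu ** 2)

def membrane_wave_speed(T, rho):
    """sqrt(T / rho) for a radial tension T (same units as E)."""
    return math.sqrt(T / rho)

# Hypothetical steel-like values in SI units, for illustration only.
E, rho, nu = 2.0e11, 7.8e3, 0.3
T = 4.0e8  # an assumed radial tension, well below E

print(bar_wave_speed(E, rho))
print(plane_stress_wave_speed(E, rho, nu))
print(membrane_wave_speed(T, rho))
```

Because $T$ is much smaller than $E$ in the elastic range, the membrane-type wave travels far more slowly than the tension wave, which is why the tension wave arrives first.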


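Next apply Hooke’s law, with Poisson’s ratio $\nu = \tfrac12$. Then

$$\frac{1}{1 - \partial u/\partial r} - 1 = \sigma_r - \tfrac12\sigma_\theta, \tag{9}$$

$$\frac{u}{r - u} = \sigma_\theta - \tfrac12\sigma_r. \tag{10}$$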



Now $U$, the velocity of an element, is given by $\partial u/\partial t$ with $r - u$
constant (cf. Taylor, 1948). In terms of $f = u/r$ this gives

$$U = -\frac{\lambda^2\,df/d\lambda}{1 - f - \lambda\,df/d\lambda}, \tag{11}$$

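and when second-order terms are neglected the last four equations reduce to

$$\lambda^2\left(1 - \frac{\lambda^2}{c'^2}\right)\frac{d^2 f}{d\lambda^2} + 3\lambda\left(1 - \frac{2\lambda^2}{3c'^2}\right)\frac{df}{d\lambda} = 0. \tag{12}$$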
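The reduction to (12) can be followed step by step. The working below is a sketch, not the author's own: it keeps only first-order terms in $f$, writes Poisson's ratio as a general $\nu$ (the result does not depend on its value), and uses primes for $d/d\lambda$.

```latex
\begin{align*}
& u = r f(\lambda): \qquad
  \frac{\partial u}{\partial r} = f + \lambda f', \qquad
  U \approx -\lambda^2 f', \\
& \text{Hooke's law to first order:}\quad
  f + \lambda f' = \sigma_r - \nu\sigma_\theta, \qquad
  f = \sigma_\theta - \nu\sigma_r, \\
& \Longrightarrow\quad
  \sigma_r = \frac{(1+\nu) f + \lambda f'}{1 - \nu^2}, \qquad
  \sigma_\theta = \frac{(1+\nu) f + \nu\lambda f'}{1 - \nu^2}, \\
& \frac{d}{d\lambda}(\lambda\sigma_r) - \sigma_\theta
  = \frac{\lambda^2 f'' + 3\lambda f'}{1 - \nu^2}, \\
& -\frac{\lambda^2}{c^2}\frac{dU}{d\lambda}
  = \frac{\lambda^2}{c^2}\bigl(\lambda^2 f'' + 2\lambda f'\bigr), \\
& \text{so that, with } c'^2 = \frac{c^2}{1 - \nu^2}: \qquad
  \lambda^2\Bigl(1 - \frac{\lambda^2}{c'^2}\Bigr) f''
  + \lambda\Bigl(3 - \frac{2\lambda^2}{c'^2}\Bigr) f' = 0 .
\end{align*}
```

The terms in $f$ itself cancel between $d(\lambda\sigma_r)/d\lambda$ and $\sigma_\theta$, so the equation is of first order in $f'$ and has a singular point at $\lambda = c'$.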

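Now $u$, and therefore $f$, is continuous at the singular point $\lambda = c'$, so the
solutions of (12) are the one parameter family

$$f = -A\left\{\frac{(1 - k^2\lambda^2)^{1/2}}{2\lambda^2} - \tfrac12 k^2\log\frac{1 + (1 - k^2\lambda^2)^{1/2}}{k\lambda}\right\}, \tag{13}$$

where $k^2 = 1/c'^2$.

The corresponding values of the other variables are

$$U = -A\,\frac{(1 - k^2\lambda^2)^{1/2}}{\lambda}, \tag{14}$$

$$\sigma_r = A\left\{\frac{(1 - k^2\lambda^2)^{1/2}}{3\lambda^2} + k^2\log\frac{(1 - k^2\lambda^2)^{1/2} + 1}{k\lambda}\right\}, \tag{15}$$

$$\sigma_\theta = A\left\{-\frac{(1 - k^2\lambda^2)^{1/2}}{3\lambda^2} + k^2\log\frac{(1 - k^2\lambda^2)^{1/2} + 1}{k\lambda}\right\}. \tag{16}$$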

The table gives the numerical values of the variables, calculated from 
these equations. 


$\lambda/c'$          1.000  0.800  0.600  0.400  0.300   0.200   0.100   0.080   0.060   0.040
$\sigma_r/A$          0.000  0.637  1.102  1.891  2.730   4.881   15.09   21.86   37.8    80.99
$\sigma_\theta/A$     0.000  0.403  0.546  0.459  0.081  -1.343  -10.60  -17.06  -32.04  -75.07
$U/\lambda A$         0.000  0.703  1.667  4.293  7.950   18.35   77.04   116.8   308.0   468.4


3.1. The Transverse Wave-front

As the value of λ is reduced from c', the radial tension σ_r increases
steadily until the singularity of (7), with ψ = 0, is reached (this is the
velocity of membrane-type waves).

At this value of λ the theory of differential equations leads us to
consider the possibility of a discontinuity of ψ. Use the suffix zero for
values in front of the wave (larger λ) and one for those behind. Then
continuity of v and of mass leads to

(A - UJ tan ^ - (A - Uj tan V x - F* 

(A - sec & - (A - U&t % sec fa. 


(17) 

(18) 



Thin Elastic-Plastic Plate by a Right Circular Cone    363



The momentum equations give 

, (A - £/,) ti ( V t - VJ sec ifi t 
o h r x sin if/y-<r.T e sin <ft 0 ----, 

. (A-^,(^«-^)»ec^ 

a H r x cos & - o, r 0 cos &- — -. 

Elimination of V and tJt 0 leads now to 

„>• - jI7^{( ~T^ )' " 1 *•. «* 4 

A non-trivial solution of these equations is possible when 
o u cos* - (A - 
00s* ^ (A - 


and then 


(*9) 

(*o) 


3.2. The Bulged Region 

Now consider the region behind the transverse wave. Then Hooke's 
law leads to 

—— - <*t ~ fa* ( 3I ) 

r - u 1 -/ 


see ^ 
i - dujbr 


o.-W 


<**> 




The dynamical equations, with constant thickness, reduce to 

,J„ . 1 \Qi-U)dV 

cos^Aa.sm*)- — 

A(A -U)dU 


cos cos ip) “00** - 

d\ c* 


d\* 


and as before, 


|ct # cos* *-(¥)' A^+ a t sin ^ cos ifi - o. 


(» 3 ) 

(» 4 ) 


(» 5 ) 


The required solution of these equations starts from the singularity 


a, cos* 1(1 - 



(26) 


and a detailed examination of the singularity, omitted here for brevity,
shows that there are no solutions through it unless σ_θ is positive. The
table of § 3, however, shows that σ_θ changes sign for λ/c' > 0.2, so this
condition implies σ_r/E > 0.04, and for most metals this is already outside
the elastic limit. Detailed examination of the elastic bulge is therefore
omitted from this work.


3.3. Transition to the Cone 

The considerations outlined in § 3.2 show that there will be no 
free elastic bulge in the plate. However, if the discontinuity in slope 
of the plate is at its point of contact with the cone, a different set of equations 
is needed, and such a transition may occur without any flat part of the 
plate being stressed beyond the elastic limit. 

The equations of § 3.1, which were obtained by using only the 
conditions at a discontinuity of slope, still hold. Thus the transition to 
the conical region takes place when 

<x,-(A- 

Now if the cone is assumed to be perfectly rough, the condition on the
cone is U = 0. Then (20) gives a relation between σ_{r1} and ψ_1. One of
these parameters, say ψ_1, may be chosen arbitrarily; then (17) gives the
value of V_1, which must be the velocity of the cone. The two parameters
ψ_1 and λ therefore correspond to the two-parameter system given by taking
different cone angles and velocities.

If the cone is assumed to be perfectly smooth, the equation of motion
relative to the cone of the material in contact with it is

-a, - - sec* 

and use of the elastic equations derived from Hooke’s law leads to 

X\i - 3 A*/4^ + 3A(i - + 

where fa is assumed small. 

The solution of (28) is

$$U = -D\lambda^{-1}\{1 - B(1 - k^{2}\lambda^{2})^{1/2}\}, \qquad (29)$$

and the corresponding stresses are 

«**■/(!-M4» , <.-W)‘ + i\ „ 

I-IaT-^ 108 —IT—rw + T +c 

3 l 3 A* k \ ) 9A 1 

where D and B are arbitrary. 

The value of D may be found from consideration of the conditions at
the transition to the cone. Determination of B depends on the condition
at the tip of the cone. In fact, at that point λ = 0, U = 0, so B = 1, and
the stress components tend to infinity at the tip of the cone. This implies
that the material becomes plastic before the tip of the cone is reached,
and a new set of equations must be used. The difference between this
case and the case to be considered in the next few sections is insignificant,
and no special consideration will be given to it.



(27) 


( 28 ) 


4. Plastic Solutions 

When the chosen value of the parameter A is large (and this will be
the case in most practical examples), the elastic limit of the material will
be reached for a value of λ greater than that characteristic of transverse
waves. There will then be a region in which the plate is flat and plastic.

The strain velocities in such a region are 






The conservation of volume in plastic strain gives

$$\frac{dU}{d\lambda} + \frac{U}{\lambda} - \frac{\lambda - U}{\tau}\frac{d\tau}{d\lambda} = 0. \qquad (32)$$

This equation is used only for the determination of the thickness τ.
The Huber-Mises flow equations reduce to

$$\frac{\dot\epsilon_r}{\dot\epsilon_\theta} = \frac{dU/d\lambda}{U/\lambda} = \frac{2\sigma_r - \sigma_\theta}{2\sigma_\theta - \sigma_r} = K, \qquad (33)$$

where K is introduced for convenience in the numerical work. The yield
criterion (Mises-Hencky) is

$$\sigma_r^{2} - \sigma_r\sigma_\theta + \sigma_\theta^{2} = Y^{2}, \qquad (34)$$

and strain hardening may be allowed for according to the law

$$Y = F(W), \qquad (35)$$

where W is the plastic work done per unit volume. F(W) is regarded as a
known function.

Now W is given by

$$-(\lambda - U)\frac{dW}{d\lambda} = \sigma_r\frac{dU}{d\lambda} + \sigma_\theta\frac{U}{\lambda}. \qquad (36)$$

Equations (32) to (36), with the dynamical equation, are sufficient for
the determination of the dependent variables σ_r, σ_θ, τ, U, Y and W. They
can be solved numerically, starting from the values given, for definite A,
by the elastic solution at the least value of λ for which the yield condition
is not violated.

For a perfectly plastic solution the yield stress Y is constant and
equations (35) and (36) are not needed.
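The algebraic parts of this system are easily illustrated. The Python
sketch below evaluates the plane-stress Mises yield function of (34) and
the strain-velocity ratio K of (33), assuming that K is the ratio of the
radial to the circumferential strain velocity given by the Lévy-Mises flow
rule; the third function expresses conservation of volume. The function
names are hypothetical and the stresses are in arbitrary units.

```python
import math


def mises_stress(sig_r, sig_t):
    """Equivalent stress in plane stress; yield occurs when this equals Y."""
    return math.sqrt(sig_r ** 2 - sig_r * sig_t + sig_t ** 2)


def strain_rate_ratio(sig_r, sig_t):
    """K = (radial strain velocity) / (circumferential strain velocity).

    Assumes the Levy-Mises rule, so K = (2 sig_r - sig_t) / (2 sig_t - sig_r).
    K is infinite when sig_r - 2 sig_t = 0.
    """
    denom = 2.0 * sig_t - sig_r
    if denom == 0.0:
        return math.inf
    return (2.0 * sig_r - sig_t) / denom


def thickness_strain_rate(eps_r, eps_t):
    """Through-thickness strain velocity required by constant volume."""
    return -(eps_r + eps_t)


if __name__ == "__main__":
    print(mises_stress(1.0, 0.0), strain_rate_ratio(1.0, 0.0))
    print(mises_stress(1.0, 1.0), strain_rate_ratio(1.0, 1.0))
```

For simple radial tension the ratio is -2 (the hoop contraction is half the
radial extension), and for equal biaxial tension it is 1.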


4.1. The Bulge


The solution given above holds, in general, for the annulus bounded 
internally by the circle corresponding to the velocity of transverse waves, 
which is given in § 3.1. Inside this circle the plate is bulged out of its 
own plane, and more general equations are required. 

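Now the rate of strain component ε̇_r in such a bulge is

$$\dot\epsilon_r = \frac{1}{t}\frac{dU}{d\lambda} - \frac{\lambda - U}{t}\tan\psi\,\frac{d\psi}{d\lambda}, \qquad (37)$$

so the conservation of volume equation is

$$\frac{dU}{d\lambda} + \frac{U}{\lambda} - (\lambda - U)\tan\psi\,\frac{d\psi}{d\lambda} - \frac{\lambda - U}{\tau}\frac{d\tau}{d\lambda} = 0. \qquad (38)$$

The flow equation (33) is replaced by

$$\frac{\lambda(dU/d\lambda) - \lambda(\lambda - U)(d\psi/d\lambda)\tan\psi}{U} = K, \qquad (39)$$

and the rate of work equation by

$$-(\lambda - U)\frac{dW}{d\lambda} = \sigma_r\frac{dU}{d\lambda} - (\lambda - U)\sigma_r\frac{d\psi}{d\lambda}\tan\psi + \sigma_\theta U/\lambda. \qquad (40)$$

Write

$$\sigma_r\cos^{2}\psi - (\lambda - U)^{2}/c^{2} = P, \qquad (41)$$

then (7) becomes

$$\frac{d\psi}{d\lambda} + \frac{\sigma_\theta\sin\psi\cos\psi}{\lambda P} = 0. \qquad (42)$$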

Elimination of τ between (6) and (38) leads to
.dU 




- a(A - U)o 9 \-jr sin if/ cos if> 
d A 0 A 


+ A(A- U) cos* - (A- i/)a # + Aa # cofl*^»o, 

and use of (39) to 

A(A - U) cos* 4 ^ - P)i^- - (A - £/)of + o,(A - aATC/) cos* ^ - o. (43) 

«A «A 

(40) reduces to

$$(\lambda - U)\,dW/d\lambda + (U/\lambda)(\sigma_\theta + K\sigma_r) = 0.$$


Now near the singularity P = 0, (42) reduces to

dP_ PX fa, 3(A-^) 
dtp Of sin tp cos A «* 


(44) 


so for real solutions either ψ or σ_θ must vanish at P = 0. Moreover, the
former possibility arises only for o ( > o,* and it is by no means clear that 
this can occur in practice. If it is assumed that σ_θ = 0, the value of σ_r in
(20) follows from the yield condition, and (20) gives a relation between
U_1 and ψ_1. If one of these is chosen arbitrarily, the computation of the
stresses and velocities in the bulged region is quite straightforward, apart
from a rather tedious power series expansion to get away from the
singularity.


* If it does, the initial value of dψ/dλ is the arbitrary parameter.



4.2. Conditions on the Cone

For a rough cone, the computation of the bulge is continued to a
point at which U vanishes. The corresponding values of ψ and V give the
angle and speed of the cone to which the solution corresponds.

For a smooth cone a trial-and-error solution must be used. At an
arbitrary point of the bulge solution the equations are replaced by those
appropriate to motion on a smooth cone, tangential to the plate at that
point. This solution is then computed. If the choice of the junction
between plate and cone is correct, then at λ = 0, U vanishes. If this is not
so, another cone solution must be computed, commencing from a different
point of the bulge solution.

The equations on the cone are 


d„ s X(X-U) , , dU 

^(XtoJ-to, -J-T sec* * • dX , 

( 45 ) 

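where ψ_1 is the cone angle, with

$$\frac{dU}{d\lambda}\,\frac{\lambda}{U} = K, \qquad (46)$$

$$\sigma_r^{2} - \sigma_r\sigma_\theta + \sigma_\theta^{2} = Y^{2}, \qquad (47)$$

$$Y = F(W), \qquad (48)$$

$$-(\lambda - U)(dW/d\lambda) = \sigma_r(dU/d\lambda) + \sigma_\theta(U/\lambda), \qquad (49)$$

$$\frac{dU}{d\lambda} + \frac{U}{\lambda} - \frac{\lambda - U}{\tau}\frac{d\tau}{d\lambda} = 0. \qquad (50)$$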


A slight difficulty arises in the computation at points where K is
infinite (σ_r - 2σ_θ = 0). To avoid this trouble, (46) may be replaced by the
Reuss equation,

dVjdX - (d/dXXo, - to) 

U/X-(dldX)(<r t -to " ’ 

for the computation near that point. 

5. Summary of Solution Types 

For different velocities and angles of cone the solutions fall into four 
types. 

(i) For low speeds and wide cone angles the plate is elastic and flat
from infinity to the cone. There is a sharp bend where it meets the cone,
and a discontinuity of radial stress at this radius. On the cone there will,
in general, be both elastic and plastic regions.

(ii) As the speed is increased or the cone angle reduced, a new solution 
type appears, in which there is a plastic flat region between the elastic 
part of the plane, and a bulged region. At the inside of the bulged region 
the plate is tangential to the cone, and for a smooth cone plastic deformation 
continues right to the tip of the cone. 

(iii) If now the velocity of the cone is increased, the bulged region is 
reduced until at a certain velocity for any cone angle the bulge disappears, 
and the plate is flat except where it is in contact with the cone. 



Fig. 4.—Form of the plate in the bulge solutions.


(iv) For a non-strain-hardening material, with a constant yield stress
Y, or for any stress-strain curve with a horizontal asymptote, there is a
definite limit √(Y/ρ) to the possible transverse wave velocity. If then
the cone has a large angle and a large velocity, the resolved part of the
velocity of its surface in the plane of the plate will exceed this value. The
above solutions then break down.

The solution may, however, be easily constructed. The plate will
remain flat except where in contact with the cone, but at the circle of
first contact with the cone there will be a shear stress, say T, in the plate.
This will be balanced by a pressure between the cone and the plate,
concentrated over a small region near that circle.

The equations at the discontinuity of slope are then 

o rt T x sin ijtx + r x T cos ^-{(A - U x )r l V x sec 
and 

o^Tj cos - Tr x sin ^ - <j^t 0 - {(A - U^)r x { £7 0 - £/i) sec 

where ψ_1 is the angle of the cone. The equations corresponding to (19)
and (20) are then

(A - U x )*/<*- o H cos* ipi-T cosi* ^x/sin 

and 

(A - Utfl* - - (T cos & sin &)/(A - UJ - o, 

and there is no restriction on the value of A at which such a discontinuity 
is possible. 





Acknowledgment 

My thanks are due to Professor Sir Geoffrey Taylor for suggesting 
the problem, and for his help and encouragement. 


REFERENCE TO LITERATURE 

Taylor, G. I., 1948. “The Formation and Enlargement of a Circular Hole in
a Thin Plastic Sheet”, Quart. Journ. Mech. App. Math., i, pp. 105-124.


(Issued separately October 20, 1952) 





XXVII.—The Rotational Field behind a Bow Shock Wave in
Axially Symmetric Flow using Relaxation Methods.*
By A. R. Mitchell, Ph.D., and Francis McCall, B.Sc.,
United College, University of St Andrews. Communicated by
Dr D. E. Rutherford. (With Five Text-figures.)

(MS. received April 12, 1952. Read July 7, 1952)

Synopsis 

The relaxation technique of R. V. Southwell is developed to evaluate mixed subsonic-
supersonic flow regions with axial symmetry, changes of entropy being taken into account.
In the problem of a parallel supersonic flow of Mach number 1.8 impinging on a blunt-
nosed axially symmetric obstacle, the new technique is used to determine the complete
field downstream of the bow shock wave formed. Lines of constant vorticity and Mach
number are shown in the field, and where possible a comparison is made with the
corresponding 2-dimensional problem.

1. Introduction

In supersonic flow past a symmetrically placed infinite cone, the flow 
behind the shock wave forms a conical field (Busemann, 1929) provided 
the shock surface is attached to the tip of the cone. The shock, which is a 
co-axial circular cone, is everywhere of constant strength and so the flow 
behind the shock wave is irrotational. The physical characteristics of such 
a flow have been theoretically determined for the possible range of incident 
Mach numbers and cone angles by Hantzsche and Wendt (1942) and 
Kopal (1947). If, however, the boundary conditions are such that an
attached shock is impossible, the conical nature of the field is completely
destroyed, and no theoretical solution to such a problem seems likely in
the near future.

The attempts so far made to obtain a numerical solution of the field
behind a bow shock wave in axially symmetric flow include those of
Maccoll and Codd (1945) and Drougge (1948). In these investigations,
where the forward Mach number ranged from 1.50 to 2.15, relaxation
methods were used to evaluate the subsonic region, and isentropic
conditions were assumed downstream of the shock wave.

In the present paper, Southwell's relaxation technique is modified and 

* This paper was assisted in publication by a grant from the Carnegie Trust for the 
Universities of Scotland. 





used to determine the complete mixed subsonic-supersonic rotational
field downstream of the detached bow wave formed when a uniform
parallel flow of Mach number 1.8 impinges on a square-nosed axially
symmetric obstacle. At such an incident Mach number, an attached
shock wave is only possible up to a semi-cone angle of 37°. The method
of treatment is substantially similar to that employed in the corresponding
2-dimensional problem by one of the present authors (1951), and to avoid
repetition only the main differences will be described in detail.


2. The Fundamental Equations 

Cylindrical polar co-ordinates x, r are used, x being measured along
the axis of symmetry and r perpendicular to it. The components of the
velocity q in these two directions are u, v respectively. The stream
function ψ is then defined by the equations

$$u = \frac{1}{\rho r}\frac{\partial\psi}{\partial r}, \qquad v = -\frac{1}{\rho r}\frac{\partial\psi}{\partial x},$$

where ρ denotes the density.

The condition which ψ must satisfy for non-viscous, steady,
compressible flow is (Vazsonyi, 1945)


8 x\pr 8 x 1 8 r\pr 8 r) R ’ 


where p, S denote the pressure and entropy respectively and R is the gas 
constant. This reduces to 

(,) 

, 0 » 8 * 

where x—P~* and V* denotes-1—. 

8 x* 8 r* 

The flow behind a shock wave is adiabatic, but not necessarily 
isentropic. The entropy, however, maintains a constant value along each 
streamline, and so Bernoulli's equation for the rth streamline behind the 
shock wave may be written 




where c_s denotes the velocity of sound at a stagnation point, (ρ_s)_r the value
of the stagnation density on the rth streamline, and γ is the ratio of the
specific heats.



In terms of χ and ψ, this equation becomes


( 3 ) 


Equations (2) and (3) are the fundamental equations for the axially 
symmetric field downstream of the shock wave. 

For convenience, these two equations may be written in the non- 
dimensional form, 


apd 


7rS'-^-¥UUt)\ 


( 4 ) 


( 5 ) 


To obtain these equations, h, a significant length in the problem under
consideration, and 2πG, the mass flow per second under free stream
conditions, have been introduced. With a slight change of notation,

V Jg p 

X* P* x 1 r now 8tan d for the non-dimensional quantities-, —, - 1 

( X')r G (p g \ 

X T 

where (x,) r and (p,) r are the stagnation values of x and p on the rth 

h h 

streamline behind the shock, and the flow parameter on this streamline 
has been expressed as 

G 

The pressure p can be eliminated by using the dimensional adiabatic law

L <£*h 


and equation (4) becomes 


y ddf 1 r* 1 dS 

VW-W - - ?+ -« =• “-<» 


( 6 ) 


r dr yv\ x >1+1 R 

The value of the flow parameter ν_r is given by one of the present authors
(1951),


Jt 




(y+i)4*+(y- 1 ) 

Pi 


Iv. 

v-l 


( 7 ) 


where ν is the flow parameter in front of the shock, and p_1, p_2 denote the
values of the pressure upstream and downstream respectively of the shock
wave on the rth streamline.

3. The Relaxation Method 

The Southwell relaxation method (Green and Southwell, 1943) is used,
and in all calculations the value 1.4 has been taken for γ.

A square network of non-dimensional mesh size a is used to cover the 
field, and the finite difference approximations to equations (5) and (6) are 
obtained as 

g Xo~ 4 - *o~ 1(1 * T) " ~ (y \# K + X - a ^A»)] W 

and 

'• s S C4); (,) 

where the suffixes 0, 1, 2, 3, 4 are as shown in fig. 1.


Fig. 1.


Thus for a given ψ-distribution, R_0, χ_0 and hence the residuals F_0 can be
calculated for each node of the net, using equations (8) and (9). The
ψ-distribution is now modified either by using a relaxation pattern, or by
trial and error, in order to reduce the residuals. The changes in the
residuals δF_j (j = 0, 1, 2, 3, 4) following a change δψ_0 in ψ_0 are given by
the formulae










and 

*Fi - |xo +- ^(4^0 - 2 ^‘) 


while


^ (y-l)(v r )Jr W/nv, .... .. 

^“ 2A*f* ^ W]/» ,/■*» 3i 4i 


and where Q_j has the following values for j = 1, 2, 3, 4:
Cl- 


C* — ” ~ AXAt -<l><d + ~t 

ar t a r t 

Qt- ~ 7^0-*A»)(0«- M 


(11) 


C 4 --^o-W.-W—. 

In deriving these formulae, it has been assumed that (v r )J and 



do not vary with the change δψ_0 in ψ_0. The consequent inaccuracy in the
relaxation pattern is unimportant.
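The relaxation procedure described in this section can be sketched in a
few lines. The Python example below is a minimal illustration only: it
assumes Laplace's equation on a square net in place of equations (8) and
(9), so that the residual at a node is F_0 = ψ_1 + ψ_2 + ψ_3 + ψ_4 - 4ψ_0, and
a change δψ_0 alters F_0 by -4δψ_0 and each neighbouring residual by +δψ_0
(the relaxation pattern). The function name, tolerance and step limit are
hypothetical.

```python
def relax_laplace(psi, tol=1e-10, max_steps=100000):
    """Southwell-style relaxation for Laplace's equation on a square net.

    psi is a list of rows; boundary values are held fixed. At each step the
    interior node with the largest residual is found and that residual is
    liquidated by changing psi at the node. Returns the number of steps.
    """
    n, m = len(psi), len(psi[0])

    def residual(i, j):
        return (psi[i - 1][j] + psi[i + 1][j] + psi[i][j - 1] + psi[i][j + 1]
                - 4.0 * psi[i][j])

    for step in range(max_steps):
        worst, wi, wj = 0.0, -1, -1
        for i in range(1, n - 1):
            for j in range(1, m - 1):
                r = residual(i, j)
                if abs(r) > abs(worst):
                    worst, wi, wj = r, i, j
        if abs(worst) < tol:
            return step
        # A change d in psi_0 changes F_0 by -4d, so d = F_0 / 4 clears it.
        psi[wi][wj] += worst / 4.0
    return max_steps


if __name__ == "__main__":
    # Boundary values taken from the harmonic function psi = j (column index).
    grid = [[float(j) for j in range(5)] for _ in range(5)]
    for i in range(1, 4):
        for j in range(1, 4):
            grid[i][j] = 0.0
    steps = relax_laplace(grid)
    print(steps, grid[2])
```

In the paper the residuals are those of the non-linear equations (8) and
(9) and the pattern is only approximate, but the cycle of computing
residuals, locating the largest, and adjusting ψ at that node is the same.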


4. The Problem 

The problem under consideration is that of a uniform parallel air
stream of Mach number 1.8 flowing past a square-nosed cylindrical
obstacle. The position and shape of the bow shock wave formed in front
of the obstacle was obtained from a photograph taken in the N.P.L.
(Holder, North and Chinneck, 1949). The part of the field examined
downstream of the shock wave extends from the axis of symmetry to the
first streamline not appreciably deflected by the shock wave. At this
streamline the shock wave makes an angle sin⁻¹(1/1.8) with the line of flow.





In this problem, the length h is chosen to be five times the semi-width
of the obstacle, and 2πG to be the free stream flow per second through a
circular cylinder of radius h. The first streamline not appreciably
deflected by the shock wave is found to be a distance 2h from the axis of
symmetry. The shock wave cuts this streamline in the point A, the sonic
line in B and the axis of symmetry in C. The obstacle is bounded by
DEF. The free stream conditions of this problem are M = 1.80, ρ = 0.287,
ν = 1.262, R = 0.0324, and ψ = r².
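Two of the figures quoted for the free stream can be checked from standard
perfect-gas relations. The Python sketch below evaluates the Mach angle
sin⁻¹(1/M), which is the limiting inclination of the shock at the
undeflected streamline, and the ratio of static to stagnation density in
isentropic flow, which is assumed here to be the meaning of the quoted
non-dimensional density. The function names are illustrative, and γ = 1.4
as in § 3.

```python
import math

GAMMA = 1.4  # ratio of specific heats used throughout the paper


def mach_angle_deg(mach):
    """Mach angle in degrees, arcsin(1/M), for supersonic flow."""
    if mach < 1.0:
        raise ValueError("Mach angle is defined only for M >= 1")
    return math.degrees(math.asin(1.0 / mach))


def density_ratio(mach, gamma=GAMMA):
    """Static/stagnation density ratio for isentropic perfect-gas flow."""
    return (1.0 + 0.5 * (gamma - 1.0) * mach ** 2) ** (-1.0 / (gamma - 1.0))


if __name__ == "__main__":
    print(round(mach_angle_deg(1.8), 2))   # Mach angle for M = 1.8
    print(round(density_ratio(1.8), 3))    # compare with the quoted density
```

At M = 1.8 the Mach angle is a little under 34°, and the density ratio
agrees with the quoted free-stream value to three figures.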



The field inside ABCDEF (fig. 2), where the rotational flow is mixed
subsonic and supersonic, is now evaluated by the methods of the previous
section. A brief summary of the method used need only be given, as it is
similar to that described fully by one of the present authors (1951) in the
evaluation of the similar 2-dimensional problem. The boundary
conditions again consist of a knowledge of the stream function ψ along
ABCDEF together with the Mach number M and the slope of the
streamlines on the downstream side of the shock. In particular, knowledge
of the pressure ratio p_2/p_1 across the shock enables the flow parameter ν_r
to be calculated from equation (7) for every streamline. The quantity
(1/R)(dS/dψ) on the downstream side of the shock wave is found from a
knowledge of the entropy increase (1/R)(S_r - S_0) across the shock wave at
all points. ν_r and (1/R)(dS/dψ) are each plotted against ψ in fig. 3. These
quantities are constant along each streamline, and hence fig. 3 can be used
everywhere downstream of the shock. As in the 2-dimensional problem, the
ratio of the flow parameters on the two sides of the shock is as great as 1.5
for the streamline which crosses the shock normally.
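The pressure ratio and entropy increase across the shock, on which these
boundary conditions depend, follow from the local shock inclination. The
Python sketch below uses the standard Rankine-Hugoniot relations for a
perfect gas with the normal component of the incident Mach number; it is
not the paper's equation (7), and the function name and the angle β
(between the shock and the incident flow) are assumptions introduced for
illustration.

```python
import math

GAMMA = 1.4  # ratio of specific heats used in the paper


def shock_jump(mach, beta_deg, gamma=GAMMA):
    """Pressure ratio p2/p1 and entropy rise (S2 - S1)/R across a shock.

    beta_deg is the angle between the shock and the incident flow. The
    normal Mach number is clamped at 1 so that a Mach wave gives no jump.
    """
    mn2 = max((mach * math.sin(math.radians(beta_deg))) ** 2, 1.0)
    p_ratio = 1.0 + 2.0 * gamma / (gamma + 1.0) * (mn2 - 1.0)
    rho_ratio = (gamma + 1.0) * mn2 / ((gamma - 1.0) * mn2 + 2.0)
    ds_over_r = (math.log(p_ratio) - gamma * math.log(rho_ratio)) / (gamma - 1.0)
    return p_ratio, ds_over_r


if __name__ == "__main__":
    for beta in (90.0, 60.0, math.degrees(math.asin(1.0 / 1.8))):
        print(round(beta, 2), shock_jump(1.8, beta))
```

The entropy rise is greatest where the shock is normal to the stream (on
the axis) and falls to zero where the shock degenerates into a Mach wave,
which is why the field behind a curved bow wave is rotational.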



Fig. 3.


Using an initial network of mesh size a = 1/40, it is now possible to
start applying equations (8) and (9) to a ψ-distribution in the region
ABCDEF. The value of R is calculated from equation (8) for each node,
and from a graph of R against χ the value of χ is found. The residuals
F_0 are then calculated from equation (9). Again the difficulty of keeping
R below its maximum value of 0.068 is encountered at nodes near the sonic
line. The method employed to overcome this difficulty and in so doing to
locate the sonic line is explained fully by Rutherford and one of the present
authors (1951). In the reduction of residuals at nodes away from the
sonic line, ψ is modified using the relaxation pattern. At nodes near the
sonic line, however, a trial-and-error method of altering the stream function
is again employed.

The part of the field containing the sonic line and the corner is examined
in more detail using a finer net of mesh size a = 1/80. The complete field of
flow showing the lines of constant Mach number is illustrated in figs. 4 and 5.





Fig. 4.


Fig. 5.


5. The Rotational Field 

The vorticity in the rotational field of flow behind the bow shock wave
is given by Vazsonyi (1945),

$$\omega = \frac{pr}{R}\frac{dS}{d\psi}.$$

In non-dimensional form this equation becomes

$$\omega = \frac{pr}{\gamma\nu_r}\,\frac{1}{R}\frac{\partial S}{\partial\psi}, \qquad (12)$$

where, with a slight change of notation, ω, p, r, and ψ now stand for the
non-dimensional quantities hω/c_s, p/(p_s)_r, r/h, and ψ/G respectively. The
quantity (1/γν_r)(1/R)(dS/dψ) has a constant value along each streamline
behind the shock. The non-dimensional vorticity is calculated from
equation (12) at all nodes downstream of the shock, and lines of constant
vorticity are shown in figs. 4 and 5.

The value of the vorticity on the shock wave is of course independent of
the field downstream of the shock, and since pr/γν_r increases with
increasing ψ along the shock, the vorticity is found to be a maximum after
the point of inflexion on the graph of (S_r - S_0) against ψ.


6. Conclusion

This paper shows that the mixed subsonic-supersonic rotational field 
behind an axially symmetric bow shock wave can be evaluated by means of 
an extension of Southwell’s relaxation technique. It again demonstrates 
that a solution to a problem involving a mixed elliptic-hyperbolic type 
differential equation can be obtained using finite difference approximations. 
Because of the non-linear nature of the fundamental equations, it has not
been possible to examine the finite difference approximations for stability
and convergence (O’Brien, Hyman, and Kaplan, 1951).

As in the corresponding 2-dimensional problem, the Mach number well
downstream of the shock wave varies from about 1.42 on the obstacle side
to 1.80 on the first streamline not appreciably deflected by the shock,
illustrating the importance of taking into account the rotational nature
of the flow behind an axially symmetric bow shock wave. Due to the
increased curvature of the shock wave near the axis, the point on the
shock at which the vorticity is a maximum is much nearer the axis than in
the 2-dimensional case. Again the streamline starting there is a locus of
high vorticity.


REFERENCES TO LITERATURE 

Busemann, A., 1929. “Drücke auf kegelförmige Spitzen bei Bewegung mit
Überschallgeschwindigkeit”, Zeits. Angew. Math. Mech., ix, 496.

Drougge, G., 1948. “The Flow around Conical Tips in the Upper Transsonic
Range”, The Aeronautical Research Institute of Sweden Report 25.

Green, J. R., and Southwell, R. V., 1943. “High Speed Flow of Compressible
Fluid through a Two-Dimensional Nozzle”, Phil. Trans., A, ccxxxix,
367-386.

Hantzsche, W., and Wendt, H., 1942. “Supersonic Flow past Cones”,
M.O.S. Translation, 124.

Holder, D. W., North, R. J., and Chinneck, A., 1949. “Observations of the
Bow-Waves of Blunt-nosed Bodies of Revolution in Supersonic Airstreams”,
A.R.C. Report 12495.

Kopal, Z., 1947. “Tables of Supersonic Flow round Cones”, Tech. Rep. 1,
Mass. Inst. Tech.

Maccoll, J. W., and Codd, J., 1945. “Theoretical Investigations of the Flow
around Various Bodies in the Sonic Region of Velocities”, M.O.S. Theoretical
Research Report 17.

Mitchell, A. R., 1951. “Application of Relaxation to the Rotational Field of
Flow behind a Shock Wave”, Quart. Journ. Mech. Appl. Math., iv, 371-383.

Mitchell, A. R., and Rutherford, D. E., 1951. “Application of Relaxation
Methods to Compressible Flow past a Double Wedge”, Proc. Roy. Soc. Edin.,
A, lxiii, 139-154.

O’Brien, G., Hyman, M., and Kaplan, S., 1951. “A Study of the Numerical
Solution of Partial Differential Equations”, Journ. Math. Phys., xxix,
223-252.

Vazsonyi, A., 1945. “On Rotational Gas Flows”, Quart. Appl. Math., iii, 29-37.


(Issued separately October 20, 1952) 





XXVIII.—A Molecular Sum Rule.* By D. ter Haar, Department
of Natural Philosophy, University of St Andrews

(MS. received June 19, 1952. Revised MS. received August 27, 1952.

Read November 10, 1952)


Synopsis 

A derivation is given of a sum rule which may be useful in the discussion of line 
intensities of molecular spectra. 


In connection with a discussion of dissociation equilibria in interstellar
space, Kramers and ter Haar (1946) came across the following sum rule.
If λ is the molecular orbital quantum number, λ = 0, 1, 2, . . . corresponding
respectively to σ, π, δ, . . . states (compare Mulliken, 1932), we have
the selection rules Δλ = 0, ±1, and if the oscillator strengths are denoted
by f, we have the following relations:

$$\sum_{\Delta\lambda=-1} f = \frac{1-\lambda}{3}, \qquad \sum_{\Delta\lambda=0} f = \frac{1}{3}, \qquad \sum_{\Delta\lambda=+1} f = \frac{1+\lambda}{3}, \qquad (1)$$

of which the special case of λ = 1 was used by Kramers and ter Haar (1946)
in their discussion of the CH spectrum.
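The content of (1) can be illustrated on a model for which every matrix
element is known in closed form. The Python sketch below assumes an
electron in an isotropic harmonic well (a model chosen here for
convenience and not used in the paper), labels its states by the numbers of
circular quanta n_plus and n_minus and of axial quanta n_z, so that
λ = n_plus - n_minus, and adds the oscillator strengths of the upward and
downward transitions for each value of Δλ. The function name is
hypothetical, and exact fractions are used so that the comparison is not
affected by rounding.

```python
from fractions import Fraction


def oscillator_sums(n_plus, n_minus, n_z):
    """Sums of oscillator strengths for d(lambda) = -1, 0, +1.

    For a harmonic well each polarisation component connects the state to
    one level above (positive f) and one level below (negative f); in units
    where a single quantum jump from the ground state has f = 1/3, the
    strengths are (n + 1)/3 upward and -n/3 downward.
    """
    third = Fraction(1, 3)
    return {
        # x + iy creates a "plus" quantum or destroys a "minus" quantum.
        +1: third * (n_plus + 1) - third * n_minus,
        # x - iy creates a "minus" quantum or destroys a "plus" quantum.
        -1: third * (n_minus + 1) - third * n_plus,
        # z creates or destroys an axial quantum and leaves lambda unchanged.
        0: third * (n_z + 1) - third * n_z,
    }


if __name__ == "__main__":
    for state in [(0, 0, 0), (1, 0, 0), (2, 1, 3)]:
        lam = state[0] - state[1]
        print(lam, oscillator_sums(*state))
```

For every state the three sums come out as (1 - λ)/3, 1/3 and (1 + λ)/3, and
their total is unity, as the ordinary one-electron sum rule requires.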

It was originally planned to include the proof of (1) with other material, 
but this plan had to be abandoned. As the general relations (1) do not 
seem to be generally known while they may be of interest to molecular 
spectroscopists, and as the derivation of these relations is an excellent 
example of the elegant methods which were so characteristic of Kramers, 
I felt that it might be worth while to reproduce the derivation of (1) as it 
was outlined to me by Kramers when we were preparing our paper on 
the conditions in interstellar space. 

The sum rule (1) can be applied whenever we are dealing with molecules
where (a) the assignment of λ-values is justified, (b) we are
interested in transitions involving one electron only. Examples are H₂
(one of the 1sσ electrons), He₂⁺ (the 2sσ electron), CH (the 2pπ electron)
and so on (see Mulliken, 1932, for the meaning of the symbols).

It may be remarked here that a similar derivation can be given of
the sum rules applying to atomic quantum numbers and that (1) can be
generalised to other cases.

* This paper was assisted in publication by a grant from the Carnegie Trust for the 
'Universities of Scotland. 





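Consider a system with cylindrical symmetry. The Hamiltonian
operator H will be of the form

$$H = \Omega(\rho, z) - \frac{C}{\rho^{2}}\frac{\partial^{2}}{\partial\chi^{2}}, \qquad (2)$$

where ρ, z and χ are cylindrical co-ordinates, and where

$$C = h^{2}/8\pi^{2}m \qquad (3)$$

(h: Planck's constant; m: electron mass), and where

$$\Omega(\rho, z) = -C\left[\frac{\partial^{2}}{\partial\rho^{2}} + \frac{1}{\rho}\frac{\partial}{\partial\rho} + \frac{\partial^{2}}{\partial z^{2}}\right] + U(\rho, z), \qquad (4)$$

with U the potential energy.

The stationary states of the molecule are characterised by two quantum
numbers, n and λ, and the corresponding (normalised) characteristic
functions are of the form

$$\psi_{n,\lambda}(\rho, z, \chi) = \phi_{n,\lambda}(\rho, z)\,e^{i\lambda\chi}. \qquad (5)$$

The energy levels E_{n,λ} satisfy the equation

$$H\psi_{n,\lambda} = E_{n,\lambda}\psi_{n,\lambda}. \qquad (6)$$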

The completeness relation for the has the form (Kramers, 1938) 

2$U<P» *0 "[ppl "* 8 (P '•(* -*0» (7) 

n 

where δ(x) is Dirac’s δ-function (Dirac, 1935).

The oscillator strength f_{αβ} corresponding to a transition from a state α
to a state β is given by the formula (Bethe, 1933)

$$f_{\alpha\beta} = (3Ce^{2})^{-1}(E_\beta - E_\alpha)\,|P_{\alpha\beta}|^{2}, \qquad (8)$$

where e is the electronic charge and where the P_{αβ} are the matrix
elements of the electric polarisation.

In our case we are interested in the following three sums: 

*5»" 2 A »)*,«'» *£fc — 2A»5 *±i.»'* ( 9 ) 

Expressing the operator of the electric polarisation in cylindrical co¬ 
ordinates and remembering the form of the characteristic functions (5), 
we see that we may use the following expressions for the polarisation 
operator: 

p a,a“«» * ± 1 - M* ^'» - fre* \ (*o) 





From equations (8) to (10) we get 

-( 3 CT 1 *01 

where the volume element d V is given by the equation 
dF—p dp di dg—do> d^. 


(11) 

( 13 ) 


( 13 ) 


Using equation (6), introducing the from equation (5), using 
equations (2) and (7) and integrating by parts, we get for So* 


s«- - ( 30 1 XjJd V d V'$\ „.(H - EJh 

- -CjO^IJda, do,' ft+CA*/p» -RnUfipT^ip-p 1 ) S(*-O 


“ (3O -1 j"p dp do ^.[£1 + CX'lp'-EMln- 


(14) 


Similarly we have for S ± the expression 

S ± - -(60 _1 Jpdpd*p^„[Q + C(\ ±i) «/p»- EM \.„• (IS) 

From equations (2) and (5) we have

$$[\Omega + C\lambda^{2}/\rho^{2} - E_{n,\lambda}]\,\phi_{n,\lambda} = 0, \qquad (16)$$

and we get from equations (14) to (i6) and (4) after some simplifications, 

S,--l|^-V A . n p d pdx, (17) 

s ± -- *j[=Fi*K A. + Jf dp do. (18) 

From these last two equations and the fact that the characteristic
functions are normalised, equations (1) follow after integration by parts.

* Primes on a function denote that the argument of the function should be primed.





REFERENCES TO LITERATURE 

Bethe, H., 1933. Handb. Phys., xxiv, 431. Berlin (Springer).

Dirac, P. A. M., 1935. Quantum Mechanics, 74. Oxford University Press.

Kramers, H. A., 1938. Hand- u. Jahrb. Chem. Phys., i, 129. Leipzig,
Akademische Verlagsgesellschaft.

Kramers, H. A., and ter Haar, D., 1946. Bull. astr. Inst. Netherlands,
x, 137 (especially § 5).

Mulliken, R. S., 1932. Rev. Mod. Phys., iv, 1.


(Issued separately December 4, 1952) 





XXIX.—The First Chemical Society, the First Chemical Journal,
and the Chemical Revolution (Part II).* By James
Kendall, M.A., D.Sc., LL.D., F.R.S., P.R.S.E.

(MS. received September 13, 1952. Read November 10, 1952)

Since the delivery of my presidential address (1) in July I have assembled
an amount of supplementary information regarding “the Chemical Society
instituted in the beginning of the Year 1785”. This, together with a
brief description of some other chemical societies of the revolutionary
period, forms the basis of the present paper.

First of all, it will be expedient to furnish a complete list of the dis¬ 
sertations read before the Society during 1785-86 and included in the 
first volume of its Proceedings, appending short comments with respect 
to the communicators or their topics when anything of special interest arises. 

Dissertations Read before the Chemical Society Instituted 
in the beginning of the Year 1785 

I. On Chemical Attraction, by Mr William Scott. It is very appro¬ 
priate that the author of this first paper—he also contributed Paper 11, 
on Fermentation—should be able to proclaim himself, in his inaugural 
dissertation presented to the Faculty of Medicine in June 1786, “Extra¬ 
ordinary Member and President for the Year of the Chemical Society of 
Edinburgh". Despite his name, Mr Scott was an Irishman. His thesis, 
entitled De Acido atmospherico , sive aereo, is in all probability one of 
the two—Mr Carmichaell’s being the other—sent by Black to Lavoisier 
in 1790 as proof of the fact that Edinburgh students in general embraced 
his system. For it includes the following sentence: 

Sed tametsi phlogiston repudiatur; omnia quæ sunt per regna naturæ dispersa,
esse quandam mutationum chemicarum causam una voce conclamant; omnia,
esse principium quod tantos inducere potest effectus, testantur; tale existere
principium doctissimus Lavoisier nuper indicavit.†

This, coupled with the fact that Mr Scott makes it quite clear, in his 
paper presented “in the beginning of the year 1785”, that he was even

* This paper was assisted in publication by a grant from the Carnegie Trust for the 
Universities of Scotland. 

† But although phlogiston is rejected, all things which are scattered through the
realms of nature testify unanimously that there is a certain cause of chemical changes; 
all things bear witness that there is a principle which is able to produce so many results; 
the most learned Lavoisier has recently demonstrated that such a principle does exist. 




then sceptical about the existence of phlogiston (1, p. 357), advances
still further the date at which it may be presumed that Joseph Black
was endeavouring in his courses to make his pupils understand the new
principles and explanations of the science of chemistry which Lavoisier
had so happily invented (1, p. 354). Black must, indeed, have been their
earliest advocate in Great Britain. 

2. On Mercury, by Mr William Haliday (spelt Halliday in Ramsay's
list). Mr Haliday is another Irishman who announces on the title-page 
of his graduation thesis De Electricitate Medica in 1786 that he is an 
Extraordinary Member and President for the Year of the Chemical 
Society. Evidently this Society, like the Royal Medical Society founded 
in 1737, adopted the custom of electing several presidents (usually four) 
annually (2). Mr Haliday’s paper on Mercury contains a description of 
the different results obtained by subliming Mercurius precipitatus per se 
(our modern mercuric oxide) in vessels exposed to the action of the air, 
when "it forms crystals of a beautiful deep red colour like rubies", and 
by distilling it in vessels perfectly closed, when "it becomes flowing 
mercury again". He sagely concludes: 

Thus is cleared up a seeming contradiction to be met with among the chemists, 
some of them asserting that the Calx is not reducible without the addition of 
Phlogiston, others of no less character denying this. The precipitate per se is a 
true Calx, as it is combined with pure Air and has acquired weight by the operation. 

Some later remarks are not so happily worded, but in a subsequent 
communication (Paper 16) Mr Haliday also discloses that he inclines 
towards the new theory of combustion. 

3. On Nitre, by Mr Samuel Black. Yet another Irishman who 
presented an inaugural dissertation, De Ascensu Vaporum spontaneo, to 
the Faculty of Medicine in 1786 and declares himself President for the 
Year of the Chemical Society on its title-page. It is to be hoped that 
Mr Samuel Black was no kinsman of his famous namesake, for he hesitates 
to pronounce any preference between the old and the new theories of 
combustion (1, p. 357), neatly evading the issue by stating, "I shall 
therefore leave the Society to adopt that one which may appear most 
agreeable to right reason and to known chemical laws". And he closes 
his communication with the philosophical remark: 

When we see then how much these great and respectable men have imposed 
on themselves by indulging that propensity which mankind so universally
discovered—inquiring into ultimate causes—let us learn wisdom from their errors.
Let us not be carried away by theory any farther than we have data to support us. 

Mr Black also contributed Paper 12, on Evaporation.




4. On Vitriolic Acid, by Mr Henry Johnston. He begins by saying 
that every person who is a member of this Society ought to contribute 
something for its support. He does not entertain any idea that his
contribution contains anything new, and he is perfectly correct in that opinion,
since he is an out-and-out phlogistonist. Sulphur, for him, is still a 
compound of vitriolic acid with phlogiston. Mr Henry Johnston is also 
the author of Paper 14. 

5. On Opium, by Mr William Johnston. Also the author of Paper 20. 
His contributions, like those of his precursor Henry, are purely descriptive. 

6. On Tartar, by Peter Gernon. Peter Gernon is the only original
member of the Society whose name does not appear on the list of students
registered in Joseph Black’s class of Chemistry. It is possible that his 
name has been incorrectly transcribed on Ramsay's sheet, or incorrectly 
deciphered by Ramsay, since several errors in other names will be noted 
in the course of this enumeration, but I have not located any likely 
alternative. 

7. Some Remarks on Arsenic, by Mr Robert Ross. No remarks; 
another phlogistonist. 

8. Chemical Account of Alkaline Substances, by Mr L. L. Van Meurs 
(L. van Meurs in Ramsay’s list). This “amteemann batavis” apologises 
for his want of ability to express himself in a foreign language, but he 
has written by far the longest (47 pages!) and perhaps one of the dullest 
papers in the whole volume. 

9. On Lead, by Mr Biker M‘Donald (spelt Bicker McDonald in 
Ramsay’s list). A fourth Hibernian to style himself Extraordinary 
Member and President for the Year of the Chemical Society of Edinburgh 
in his thesis presented to the Faculty of Medicine in 1786. Also the 
communicator of Paper 13. 

10. On Latent Heat, by Joshua Parr. This paper naturally teems 
with references to Joseph Black’s pioneer research on the subject, carried 
out when he was Lecturer in Chemistry at Glasgow University twenty-five 
years previously, with the co-operation of his two most famous pupils, 
James Watt and William Irvine. So great was Black’s aversion to 
publicity that this work was never printed during his lifetime, although 
he read an account of it to a literary and philosophical society that met 
at the University on April 23, 1762. Only four years after his death 
did Robison include a full description of it, prepared from Black’s lecture 
notes and from a student's manuscript thereof, in his book Lectures on the 
Elements of Chemistry, by the late Joseph Black (3). Joshua Parr also
confesses that he has drawn his material directly from Black's lectures.

The general results obtained by Black were, however, well known 




to his contemporaries through his correspondence and through Watt’s 
inventions. It was by the application of Black's new ideas on ebullition, 
and with Black’s encouragement and generous help, that Watt developed 
his fundamental idea of economizing power by a separate condenser in 
the steam-engine. “It was this”, wrote Robison, “that spread the knowledge
of the doctrine of latent heat, and the name of Dr Black.” And
McKie and Heathcote (4) have noted that it is established beyond all 
doubt that Lavoisier was already familiar with Black’s theory and method 
in 1772, yet his memoirs of 1777 and 1780 covering the same field do not 
even mention Black’s name. We cannot wonder that Robison was 
indignant; but Black remained imperturbable, as I have described in 
detail in my previous paper. The progress of chemistry, not personal 
glory, was his primary concern. 

Mr Parr is also the author of Paper 17, on Crystallization. 

11. On Fermentation, by Mr William Scott. Mr Scott flatly rejects 
the phlogistic explanation of fermentation: 

I consider the existence of phlogiston as merely ideal, and I perfectly coincide 
with the illustrious Buffon when he says that this phlogiston of the chemists is a
being of their method rather than of Nature. . . . The phlogistic doctrine is in 
direct contradiction to all the laws of gravity and matter. 

He is much more sympathetic to Dr Lubbock’s theory of the Principium
Sorbili—“one universal principle diffused through all nature, which is
the chief agent in all its great operations, as combustion, calcination, 
fermentation, etc.”—but confesses: 

This principle is liable to the same objections that I have made to phlogiston. 
It may be said that the Principium Sorbili is equally hypothetical as the other, 
and may equally be considered as existing only in imagination. 

And although his final discussion remains rather confused, since he is 
“unwilling to hazard conjectures unaided by fact, and unsupported by
experiments”, he shows rare acumen in the remark: 

From the experiments of Mr Cavendish and the reasoning of Mr Lavoisier, 
water appears to be a real compound. 

12. On Evaporation, by Mr Samuel Black. After criticizing a number 
of earlier and “altogether inadequate” theories regarding evaporation, the
author of this paper proceeds thus: 

We come now to a much more pleasing and agreeable part of our task, to 
discuss the opinions of men whose names do honour to Science, . . . Among 
these our very learned Professor Dr Black holds first place. 




There follows a long and interesting discussion of the work of Joseph Black 
on latent heat. 

13. On Iron, by Mr Biker M'Donald. Under the title of this
communication Mr M'Donald has copied the couplet:

How many dangers do Inviron 
The man who meddles with cold Iron!

Hudibras. 

He contrives to avoid these dangers very adroitly, however, by keeping 
his paper almost entirely descriptive. 

14. Some few remarks on Silver, submitted to the candid examination 
of the Chemical Society by their fellow Member, Henry Johnston. 

15. On Water, by John Crumbie. This paper bears the postscript: 
Edinʳ, 26ᵗʰ, Novʳ—1785; the only direct reference to Edinburgh in the
whole volume. 

Strangely enough, Mr Crumbie does not even mention Mr Cavendish’s 
recent discovery of the decomposition of water into its component parts, 
and still entitles it an element. If Mr William Scott (see Papers 1 and 11) 
occupied the President's chair at this meeting, I feel sure that he started 
the discussion with some very searching questions. 

16. Some remarks on the Calcination of Metals, submitted to the 
Discussion of the Chemical Society by their obliged Servt., W. Haliday. 
The artistic title-page of this paper and a brief extract hinting that Mr 
Haliday subscribes to the new theory of combustion have already been 
included in my presidential address (1). He details the objections to the 
old hypothesis and discusses the “beautiful experiments of Mr Lavoisier” 
at great length, but reveals that he has not completely freed himself from 
the phlogiston yoke. 

17. On Crystallization, by Mr Joshua Parr. The only extract worthy 
of quotation from this paper is the following: 

The doctrine of minute atoms I shall not intrude on the Society, as I do not 
think it worth their notice. 

18. On Nitrous Air, by Dr Anthony Mann. Dr Mann, another 
Irishman, graduated in the Faculty of Medicine in 1785. He quotes, 
but doubts at certain points, Priestley’s phlogistic explanation of the 
component parts of nitrous air. He himself suggests that it contains 
water. 


19. On Copper, by Mr John Gay. 

20. On Sugar, by Mr William Johnston. 

21. On Quick Lime, by Mr John Sedgwick. 

22. On Gold, by Mr William Lecky. Mr Lecky’s name does not 





appear on Ramsay's list of the fifty-nine original members; he presumably 
joined the Society after its inception. Some new blood was obviously 
urgently needed, in view of the frequent repetition of contributors manifest 
in sessions of the Society immediately preceding. Mr Lecky evidently 
justified his admission, since his medical graduation thesis in 1787 indicates 
that he was then President for the Year, the fifth Irishman to gain that 
distinction. 

23. On Phosphorus, by Mr James McIlwaine (spelt McElwaine in
Ramsay’s list). Yet another Hibernian, who also obtained his medical
degree in 1787. After first stating Stahl’s theory of the combustion of
phosphorus—“still held in repute by most modern chemists”—and then
describing Lavoisier’s recent experiments, which show that the large
increase in weight on combustion corresponds exactly with that of the
air absorbed, Mr McIlwaine continues:

I would hope that the Society will forgive my not giving a decided opinion 
in favour of either of them, as I am not yet so well acquainted with either as to 
know which of them I would give my preference to, but this I will leave to be 
decided by my ingenious fellow-members. 

What a pity it is that the discussion after the paper is not recorded! 

24. On Zinc, by Mr Thomas Burnside. And an Irishman again, 
who graduated in the Faculty of Medicine in 1786. 

25. On Lead, by Mr Thomas Gill. Our first Englishman, a medical 
graduate of 1787. 

26. On Coal, by Mr Richard (spelt Archd. in Ramsay’s list) Webb. 

27. On Magnesia, by Mr S. Latham Mitchill. Mr Samuel Latham 
Mitchill, like Mr Lecky, is not in Ramsay’s list, and must have been a 
new recruit to the Society. From the title-page of his inaugural
dissertation for the degree of M.D. in 1786, we learn that he was an
American. His later career may be summarized from an account by 
Edgar F. Smith (5): 

In 1793, he was elected to the Chair of Chemistry and Natural History in 
Columbia College. He opposed, in a very friendly way, the views of Priestley on 
phlogiston, and was the first teacher of chemistry in America to use the
nomenclature of Lavoisier [so Black’s influence spread also into the New World]. He
founded the Medical Repository, the first paper in America devoted to general, 
as well as medical, science. He was probably the first American to write on 
chemical philosophy. He became in course of years an active member of nearly 
all the learned societies of the world [including the Royal Society of Edinburgh]. 
He was a sort of human dictionary whose opinion was sought by all originators 
and inventors of every grade. His ingenious theory of the doctrines of septon 
and septic acid gave impulse to Sir Humphry Davy’s vast discoveries [rather an 
exaggeration, since Davy’s disproof (6) of Mitchill’s “theory of contagion” was a
very juvenile effort]. He was a polished orator, a versifier and a poet, a man of
infinite humour and excellent fancy.




Two brief extracts from Mr Mitchill’s paper may be cited: 

(a) I am very glad to have on this subject an author of so much originality 
and accuracy as Dr Black to copy; his experiments upon Magnesia have thrown 
a new light upon this part of chemistry and are so decisive and numerous that 
they hardly leave anything for the experimenter to devise or the writer to invent. 

(b) It has been thought by some practitioners most advisable to prescribe
the Calcined Magnesia, because they thus prevent the flatulence and belching 
so apt to ensue when the fixed air is set loose in the stomach. 

After which, it is only fitting to quote again from Edgar F. Smith: 

His eccentricities furnished material for the wits of the day to fashion many 
a joke at his expense, over which no one laughed more heartily than himself. 

28. On Sulphur, by Mr Alexander Stevens. A phlogistonist who, 
at the end of his paper, requests the aid of his fellow-members in its 
discussion: “as it is a task I find myself entirely inadequate to”.

29. Camphor and Volatile Alkali, by Mr William Symonds. Mr 
Symonds makes some sarcastic remarks against those who have not yet 
seen sufficient reason to abandon the hypothesis of phlogiston. 

30. An attempt to point out some of the Consequences which flow 
from Mr Cavendish's Discovery of the Component Parts of Water, by
Mr Thomas Beddoes. Mr Thomas Beddoes is the most famous of all 
the original fellows of the Chemical Society. He was born in Shropshire 
in 1760, attended classes at Edinburgh but graduated as M.D. at Oxford 
in 1786, and directed the Pneumatic Institution at Clifton from 1798 until 
his death in 1808 (7). A rare opportunity is presented him in this paper, 
and he throws it away. Perhaps he should not be judged too harshly, 
for Cavendish himself drew false conclusions from his own discovery, 
and remained a believer in phlogiston until his death. 

In general, Mr Beddoes is a straddler on the grand controversy of
the period: 

I think it is common to both parties to assert more than they can prove. . . . 
If two different hypotheses can be adjusted to any set of appearances, both become 
uncertain, and that in much higher ratio than two to one. Perhaps we may 
conclude without much danger of error that they are both false, or at least 
imperfect. 

He expresses the highest admiration for Mr Cavendish's work: 

Priestley, Lavoisier and Landriani have borne their testimony to the accuracy
of one, whose own authority stands little in need of confirmation, since of him
almost alone among modern philosophers it may be truly said that all his
experiments have stood the test of repetition. 

Notwithstanding these direct experiments, many have felt and still feel a 
strong repugnance to admit the consequences of them, some perhaps because 
they have always been taught to look upon water as an uncompounded body. 




To these I can only say, that if they are sure that God has bestowed upon them 
faculties by which they can discover the properties of natural bodies without the 
aid of experience, they do right in persisting. 

He believes that Mr Cavendish has given phlogiston a reprieve (he will 
not say whether short or lasting): 

Many operations can now be connected with the existence of phlogiston, 
which had admitted of no explanation before. Indeed it seems to me that but 
for this discovery, phlogiston must have been totally abandoned. When
phosphorus is burnt in vital air, and the whole of it disappears, what account could
have been given of this, different from Mr Lavoisier’s, if we did not know that 
the production of water will account both for the diminution of the air and the 
increase in weight of the consumed body. 

And, before wandering off to more important problems, such as how 
a hen introduces air into an egg and a conjecture concerning the use of 
manure, he simplifies the difficulty he has discussed above of two
conflicting hypotheses by volunteering a third, hinted at (so he says) by
Dr Black, which involves the assumption that Mr Cavendish’s analytical 
experiments are, after all, illusory. He adds naively: 

If it shall appear that his [Mr Cavendish’s] observations have been inaccurate 
or imperfect, I shall follow the light of truth, but I shall look back upon the delusion 
with an eye of regret. 

It is not surprising that Dr Beddoes, after Humphry Davy had brought 
his Pneumatic Institution into world-wide repute by his experiments on 
nitrous oxide, frittered away the rest of his life advocating drastic reforms 
in education, diet, dress, children’s toys, and innumerable other topics. 
He corresponded frequently with Joseph Black after he left Edinburgh (8). 
In April 1789 he informed Black that Dr Priestley had totally overthrown 
the French chemists; two years later he wrote, “I am glad to see your 
renunciation of the old Chemical Theory in the Annales de Chimie”;
in 1798 he encouraged Davy to publish a puerile attack on Lavoisier’s 
doctrines (6, pp. 10-13). “As little fitted for a Mentor as a weather-cock 
for a compass”—Davy never wrote a more apposite epigram.

31. Some account of the Theories of Combustion, of Heat, of Light, 
and of Colour, by Mr John Carmichaell (spelt Carmichel in Ramsay’s
list). It is significant that Mr Carmichaell is the first known Scot to contribute
to the Proceedings of the Society; how noteworthy his contribution
was has been discussed in detail in my previous paper (1, pp. 350-352). 

32. On Dephlogisticated Air, by Mr George Kirkaldie. Mr Kirkaldie 
is a second Scot, who obtained his medical degree in 1786, but he is not 
of the calibre of Mr Carmichaell. It is far from being his intention to 
enter upon the phlogiston controversy, and he merely proposes some 
observations on its nature for the consideration of the Society. He 




criticizes the views of Mr Cavendish with regard to dephlogisticated air 
at great length, but his own suggestions are even more unsatisfactory. 

This paper concludes the first volume of the Proceedings of the Chemical 
Society instituted in the beginning of the Year 1785. 


The Irish Question 

The predominance of Irishmen among the communicators is truly 
astonishing. Of the fourteen contributors whose nationality is known, 
eight are Irish (and five of these became presidents of the Society), two 
Scots (one also was elected president, and compensates in quality for the 
lack of quantity), two English, one Dutch, and one American. Several 
of the original members who failed to find their way into the pages of the 
Proceedings subsequently submitted inaugural dissertations in the Faculty 
of Medicine, thereby disclosing their country of origin. They are listed 
in alphabetical order below, with their year of graduation. 

1. John Barrow, England, 1785. 

2. James Donovan (spelt Donavan in Ramsay’s list), Ireland, 1784. 

3. Nicholas Elcock, Ireland, 1786. 

4. Edward Fairtlough (spelt Fairclough in Ramsay’s list), Ireland, 1785. 

5. James Forster, Ireland, 1785. 

6. Thomas Galley, England, 1785. 

7. Samuel Macay (spelt Macoy in Ramsay's list), Ireland, 1785. 

8. Cornelius Pyne, Ireland, 1785. 

9. William Robertson, Scotland, 1786. 

10. John Unthank, Ireland, 1784. 

11. John Watson-Sproule (spelt Sprole in Ramsay’s list), Ireland, 1787. 

Examination of all the medical theses submitted at the University 
of Edinburgh during this period has revealed a number of additional 
graduates who announce their membership of the Chemical Society on 
the title-page of their dissertations. They also are listed in alphabetical 
order below. The numbering in this list starts with 62, since the original 
59 members, plus Mr Lecky and Mr Latham Mitchill, have precedence. 

62. William Allanby, England, 1788 (President for the Year).

63. Samuel Crumpe, Ireland, 1788 (President for the Year). 

64. William Jones Evans, Ireland, 1788. 

65. Alexander Jackson, Ireland, 1787. 

66. Charles Johnston, Ireland, 1785. 

67. William Saunders O’Halloran, Ireland, 1788. 

68. Thomas Renwick, England, 1787.

69. Joseph Sherlock, Ireland, 1788. 

70. James Short, Scotland, 1788. 

71. John Ussher, Ireland, 1785. 

72. William Whitelaw, Ireland, 1786.




The final state of the poll, with the nationality of exactly one-half of 
the total membership established, is therefore: Ireland 24 (6 presidents); 
England 6 (1 president); Scotland 4 (1 president); Holland 1; America 1.
Our idea of the typical Edinburgh student of the eighteenth century, 
trudging to and from his home during the Meal Monday holiday to 
replenish his sack of oatmeal, obviously needs a little recasting. 

With the year 1788 the Society appears to come to a sudden close, 
for I have continued my examination of the title-pages of the Edinburgh 
inaugural dissertations as far as the end of 1794 without encountering a 
single further graduate who proclaims himself a member of the Edinburgh 
Chemical Society. Since every one of the original members took special 
pride in this distinction, none qualified to do so omitting to state the fact,
it is unfortunately clear that the first Chemical Society in the world faded 
out in its fourth year. 

The year 1789, however, is not entirely lacking in interest, for one 
medical graduate, John Benjamin Jachmann, breaks new ground, styling 
himself an Honorary Fellow of the Chemical Society of Glasgow! As his
name suggests, this solitary intruder from the west is also not a native 
Scot; he claims Prussian nationality. Dr Jachmann must have been a 
man of some distinction, for he was elected a President of the Royal 
Medical Society in the same year (2, p. 317), but I must leave it to my
Glasgow colleagues to discover more about him. It will be relevant here, 
nevertheless, to give a brief account of this Glasgow society, as well as 
of some others of the period. 

Other Eighteenth-Century Chemical Societies 

The history of the Chemical Society of Glasgow was unearthed by 
Dr J. A. V. Butler, formerly Lecturer in Chemistry at the University of 
Edinburgh, while working at Princeton University on leave of absence 
in 1941 (9). The first Professor of Chemistry at Princeton—then the 
College of New Jersey—was John Maclean, M.D., a Glasgow graduate. 
His son, John Maclean, became the tenth President of Princeton, and 
printed for private circulation a memoir of his father in 1876. In that 
memoir it is stated:

At the University he was, while yet a lad, a member of the Chemical Society, 
a club which appears to have met at the University, with the permission of the
College authorities, if not under the oversight of the Professors. The members
submitted, for the consideration of the Society, papers and essays upon various 
matters connected with the object of their association, and some of these papers 
seem to have foreshadowed the eminence which the authors of them attained in 
after life, as proficients in the art of Chemistry. 




Other members of this Society were William Couper, Charles Macintosh 
(the inventor of the waterproof coat; see, however, p. 397), Mr Candlish, 
Dr Tilloch, Dr Crawford (a chemist of some repute) (10), Mr John Wilson, 
Major Finlay, Mr Cruikshank, Mr Archer and Mr Monroe. One of 
seven papers which Maclean himself read before the Society, dated 
March 29 [1786?], contains a reference to Dr William Irvine, Joseph 
Black’s former collaborator (see p. 387), Lecturer in Chemistry at Glasgow 
University, who presumably sponsored its activities. Irvine died in the 
summer of 1787, and in October of that year Dr A. Eason of Manchester 
remarked in a letter to Charles Macintosh: “How does your chemical 
society get on? I am afraid, badly, since poor Irvine is gone. However, 
you must keep it up.” 

The precise date of the institution of the Glasgow Chemical Society 
is not known. It was most probably early in 1786, with late 1785 as a 
possibility, so that it runs as a very close second to Edinburgh. The exact 
date of its demise is also uncertain, but it was probably this same Society 
to which Mr John Benjamin Jachmann belonged in 1789. For in the
biographical notice of Dr John Thomson, Professor of Medicine in the
University of Edinburgh, his son Dr William Thomson states (11): 

At the beginning of the winter session 1788-89, Mr John Thomson went to 
Glasgow to attend the medical classes. Besides prosecuting the study of anatomy 
with ardour, he attended the lectures of Dr Cleghorn, who was a lecturer on
chemistry in the college, an office which had been successively held by Cullen, 
Black, and Irvine. He also joined a chemical society, which contained several 
members who afterwards attained great eminence as practical chemists. The 
doctrines of Lavoisier had just been made known, and gave much interest to the 
proceedings of young and ardent cultivators of chemical science, among whom 
it may be supposed that they found a readier reception than among those who, 
before adopting the new doctrines, had previously to unlearn the old. 

The last sentence of this extract might have been written by Joseph Black 
himself (1, p. 354). It is not surprising to find that John Maclean, soon 
after he arrived in Princeton, engaged in a controversy with Priestley on 
the phlogiston theory (6, p. 232). 

A second chemical society, however, was founded in Glasgow in 1798; 
its president in 1800 was William Ramsay, grandfather of Sir William 
Ramsay. The minute book of this society for the period October 8, 1800 
to March 25, 1801, is now in the possession of the Royal Philosophical 
Society of Glasgow (12). It consists almost entirely of a record of
experimental work, by the society, apparently carried out at its meetings, which
were held generally more frequently than once a month. 

Simultaneously we find Dr John Thomson (11, p. 16 of Introduction) 
in Edinburgh converting a Natural History Society, probably at the 




Royal College of Surgeons, into a Chemical Society, of which Lord 
Brougham and Lord Lauderdale were members. 

There were thus, before the end of the eighteenth century, no fewer 
than four chemical societies instituted in Scotland—a truly admirable 
manifestation of the clan spirit. It is, furthermore, not unreasonable to 
regard the Chemical Society of Philadelphia as a daughter society of that 
at Edinburgh in 1785. The Medical School of the College of Philadelphia 
(now the University of Pennsylvania) was instituted in 1765 under strong 
Edinburgh auspices, and the coat of arms of the University of Edinburgh 
is still to be seen above the entrance to one of its original buildings. 
John Morgan, who first taught chemistry there, and Benjamin Rush, 
who succeeded him in 1769 as the first full-time Professor of Chemistry 
in America, were both students of Joseph Black in Edinburgh. With 
such an intimate connection between the two centres, the existence of 
a chemical society at Edinburgh University, sponsored by Black, would 
certainly be a matter of common knowledge in the College of Philadelphia, 
and when James Woodhouse (then a young man of 22) founded the 
Chemical Society of Philadelphia in 1792 he may quite plausibly be 
pictured as following, consciously and deliberately, in the footsteps of
Joseph Black. 

At least two eighteenth-century chemical societies also existed in 
England, but little definite is known about them. One must have 
flourished in Manchester well previous to 1787, since Dr Eason’s letter 
to Charles Macintosh of that date, quoted on p. 395, contains the sentence:
“We intend another volume of our transactions by next spring.” Whether
this good intention was fulfilled we know not, but what a wonderful 
discovery the first volume would be! The only reference to the second 
society is in the Journal of John Playfair, Robison's successor as Professor 
of Natural Philosophy in the University of Edinburgh (13). It was 
brought to my attention by an Aberdonian, the late Professor J. C. 
Philip. 

Chemistry is the rage in London at present [1783]. I was introduced by 
Mr B. Vaughan (with whom I became acquainted in Edinburgh while he studied 
at the University there) to a chemical society, which meets once a fortnight at 
the Chapter Coffee-house. Here I met Mr Whithurst, Dr Keir, Dr Crawfurd, 
and several others. 

Playfair goes on to record how he also saw Dr Priestley, “who has
made so great a figure in the world”. Priestley was then “particularly
engaged in some experiments to prove that inflammable air is the same 
thing with phlogiston”, but Playfair was too canny a Scot to be so
convinced. 




If only this society were not so nebulous, it might well deprive the 
Chemical Society of Edinburgh of its priority. My own surmise regarding 
it is as follows. It will be noted that the London society to which the 
Edinburgh professor was introduced had a suspiciously Caledonian 
flavouring—Mr B. Vaughan was certainly one of Black's students, and 
two of the other three members named by Playfair are obvious Scots. 
I suspect that it possibly constituted an offshoot—the London branch— 
of a hypothetical older Edinburgh Chemical Society, for Mr Thomas 
Beddoes opens his communication to the Proceedings of the “society 
instituted in 1785” (the very title insinuates a precursor instituted in an 
earlier year) with the cryptic words: 

Mr President: This Society, as well as others of older date and greater name, 
has often witnessed the doubts and difficulties that divide and perplex the chemists, 
concerning the nature and product of phlogistic processes. 

To what societies is Mr Beddoes here alluding? The only definite 
suggestion I have to make is the Royal Medical Society, of which Mr 
Beddoes was President in its forty-ninth session, 1785-86. The minute- 
books of the Society for this period have been lost, but it is known (2, p. 62) 
that in 1785 a committee was appointed to supervise the fitting up of a 
laboratory for chemical experiments and, although the main interests of 
the Royal Medical Society were not strictly chemical, it must have 
held some discussions on the chemical revolution and its medical 
implications. 

Obviously there is plenty still to be discovered by the diligent historian. 
One point that has just reached my notice is that the Chemical Society 
of the University of Edinburgh was resuscitated in 1815 by James Syme. 
Alexander Miles (14) states: 

When he [Syme] became a member of Dr Hope's class in the University, he, 
along with Robert Christison and a dozen other students, founded a chemical 
society which met once a week. 

And Sir Robert Christison wrote long afterwards (2, p. 126): 

It very nearly made a chemist, instead of a surgeon, of Syme. Before it came 
to an end by the dispersion of its members, he had begun to work at the subject 
of the solvents of india-rubber, and his inquiries ended in his discovering its 
solubility in coal-tar naphtha, and the waterproofing of cloth by means of this 
solution. He published his discovery at a very early age. Nevertheless Macintosh, 
the manufacturing chemist, reaped all the honour as well as the profit. 

This, however, is a digression from the eighteenth century, to which 
period I now return. 




The Dissertations sent by Black to Lavoisier in October 1790 

The number of inaugural dissertations “in which chemical subjects 
were chosen” at Edinburgh during the period 1785-90 is much fewer 
than I anticipated, and there is little doubt that the two sent by Black 
to Lavoisier as proof that the students of the University of Edinburgh 
“in general embrace your system and begin to make use of the new 
nomenclature” were: 

1. De Acido atmospherico, sive aereo: William Scott, June 1786. 

2. De Fermentatione: John Carmichaell, September 1787. 

Extracts from these dissertations have already been quoted on p. 385 and in my 
previous paper (1, p. 356). 

The only other possibilities are: 

1. De Duobus Aeris speciebus aquam gignentibus: George Kirkaldie, 
September 1786. This, however, contains nothing of interest to Lavoisier. 

2. De Igne: Hugh Gillam, September 1786. Mr Gillam was not a 
member of the Chemical Society, and his thesis is officially placed in 
the physics section. It does include a phrase, “Quorum primus init 
prælia Lavoisier, chemiæ peritissimus ipse, et cæteri sequuntur”, which 
would have been pleasant for Lavoisier to read, but I do not think Black 
would send to Lavoisier a thesis issued from another department. 

3. De Compositione Acidi Sulphurici: Alexander Purcell Anderson, 
September 1790. The date probably eliminates this, since it is unlikely 
that printed copies would be available. Lavoisier is mentioned as 
“vir ingeniosus”, but the theoretical aspects of the topic are scarcely 
touched. 

The translation of Black’s letter to Lavoisier in the Annales de Chimie 
(15) carries a footnote regarding the two inaugural dissertations sent 
therewith: “On en donnera incessament l’extrait dans ce Journal.” 
A careful search of succeeding issues, however, failed to find these promised 
summaries. It must be assumed that they disappeared in the turmoil of 
the Revolution. 

The paucity of chemical theses submitted at Edinburgh in this period 
presumably springs from Black's increasing age and infirmity. One 
other Edinburgh adherent of Lavoisier’s doctrines, however, deserves 
brief notice here, although I have not traced him as a student of 
Joseph Black. His name is Robert Kerr, and a letter from him to 
Lavoisier, written on January 21, 1791, is among those recently discovered 
by Dr Douglas McKie in the Archives de France (see 1, pp. 353-354). 




In this letter (16) Mr Kerr first refers to the discussions which the 
Royal Society held in 1788 on the subject of phlogiston. There is no 
record in the Transactions of the Society as to what transpired at the 
five sessions devoted thereto, but Mr Kerr informs Lavoisier that “Sir 
James Hall Baronet had given our Edinburgh Royal Society a very 
Ingenious detailed account of your doctrines which had shaken the phlogistic 
faith of many and even made several converts, amongst whom I was”. 
Struck with the elegant simplicity of Lavoisier’s theory, Mr Kerr immediately 
set about translating his Elemens de Chymie “in the hopes that it 
might prove an useful present to my countrymen”. This altruism (or 
good business instinct) was properly rewarded, for although he hazarded 
a large edition, his bookseller now advises him that he must speedily 
prepare for reprinting. He accordingly presents Lavoisier with a copy 
of the translation, and requests instructions of any alterations or additions 
he would wish to have made in the next edition. 

Lavoisier’s answer has not been preserved, but it is interesting to note 
that four other editions followed between 1793 and 1802. 

Joseph Black 

This paper can most aptly close with a tribute to Joseph Black, the 
sponsor of the First Chemical Society, the inspirer of the contributors to 
the First Chemical Journal and, as now appears, in all probability the 
first chemist in Great Britain to join in the chemical revolution. How 
reverenced this grand old man of chemistry was by his colleagues is 
evinced by Lavoisier's letter to him (1, p. 353). How stimulating he was 
to his students may be judged by the following eulogy from the pen of 
Lord Brougham: 

I have heard the greatest understandings of the age giving forth their efforts 
in its most eloquent tongue—have heard the commanding periods of Pitt's majestic 
oratory—the vehemence of Fox’s declamations—have followed the close-compacted 
chain of Grant’s pure reasoning—been carried away by the mingled fancy, epigram, 
and argumentation of Plunket; but I should without hesitation prefer, for mere 
intellectual gratification ... to be once more allowed the privilege which I 
enjoyed a half century ago of being present while the first philosopher of his 
age was the historian of his own discoveries, and be an eye-witness of those 
experiments by which he had formerly made them, once more performed with 
his own hands. 

Fellows will forgive me, I am sure, for mentioning the fact that I am a 
direct descendant, in the scientific sense, of Joseph Black, since the Chair 
of Chemistry at the University of Edinburgh has passed without interruption 
from professor to student for two centuries. I could not boast nobler 
ancestry. 





REFERENCES TO LITERATURE 

(1) Kendall, 1952. Proc. Roy. Soc. Edin., A, lxiii, 346. 

(2) Gray, 1952. History of the Royal Medical Society, Edinburgh University 
Press, 315. 

(3) Robison, 1803. Lectures on the Elements of Chemistry, by the late Joseph 
Black, Constable & Co., 1. 

(4) McKie and Heathcote, 1935. The Discovery of Specific and Latent Heats, 
Edward Arnold & Co., 47. 

(5) Smith, 1914. Chemistry in America, D. Appleton & Co., 148-150. 

(6) Kendall, 1939. Young Chemists and Great Discoveries, G. Bell & Sons, 11. 

(7) Stock, 1811. Life of Thomas Beddoes, M.D., Murray. 

(8) Ramsay, 1914. The Life and Letters of Joseph Black, M.D., Constable & 
Co., 93-98. 

(9) Butler, 1942. Journ. Chem. Educ., xix, 43. 

(10) Kent, 1950. An Eighteenth-Century Lectureship in Chemistry, Jackson, 
Son & Co., 145. 

(11) Thomson, 1859. An Account of the Life, Lectures and Writings of William 
Cullen, M.D., William Blackwood & Sons, 8 (Introduction). 

(12) Wilson, 1937. Ann. Sci., ii, 451. 

(13) Playfair, 1822. The Works of John Playfair, Constable & Co. (Appendix). 

(14) Miles, 1918. The Edinburgh School of Surgery before Lister, Black, 175. 

(15) Black, 1791. Annales de Chimie, viii, 335. 

(16) McKie, 1949. Notes and Records of the Royal Society of London, vii, 13. 


(Issued separately December 4, 1952) 



INDEX 


Abel's Series, Asymptotic Validity, by A. J. 
Macintyre and Sheila S. Macintyre, 222-231. 

Aitken (A. C.). Studies in Practical Mathematics: 
V. On the Iterative Solution of a 
System of Linear Equations, 52-60. 

-Studies in Practical Mathematics. 
VI. On the Factorization of Polynomials by 
Iterative Methods, 174-191. 

-Studies in Practical Mathematics. 
VII. On the Theory of Methods of Factorizing 
Polynomials by Iterated Division, 
326-335. 

Artificial Holograms and Astigmatism, by 
G. L. Rogers, 313-325. 

Astigmatism, Artificial Holograms and, by 
G. L. Rogers, 313-325. 

Asymptotic Validity of Series—Abel’s Series, 
by A. J. Macintyre and Sheila S. Macintyre, 
222-231. 


β-disintegration Energies: Heavy β-emitters, 
by N. Feather, 242-256. 

Bhattacharyya (A.). Unbiased Statistics with 
Minimum Variance, 69-77. 

Bow Shock Wave, The Rotational Field 
behind a, in Axially Symmetric Flow, using 
Relaxation Methods, by A. R. Mitchell and 
Francis McCall, 371-380. 

Brownian Motion, Simple Model based on a 
Generalization of the Classical Random- 
walk Problem, by G. Klein, 268-279. 


Craggs (J. W.). The Normal Penetration of 
a Thin Elastic-Plastic Plate by a Right 
Circular Cone, 359-370. 

Daniels (H. E.). The Statistical Theory of 
Stiff Chains, 290-311. 

Determinants, Some Continuant, in Physics 
and Chemistry: II, by D. E. Rutherford, 
232-241. 

Difference-differential Equations—the Linear, 
with Constant Coefficients, by E. M. Wright, 
18-26. 

Differential Equation, Second-Order Linear, 
with Periodic Coefficient, having Finite 
Singularities, by Enzo Cambi, 27-51. 

Diffraction Microscopy, Experiments in, by 
G. L. Rogers, 193-221. 

Double Wedge, Application of Relaxation 
Methods to Compressible Flow past a, by 
A. R. Mitchell and D. E. Rutherford, 
139-154. 

Elastic-Plastic Plate, Normal Penetration by a 
Right Circular Cone, by J. W. Craggs, 
359-370. 

Electrodynamics, Reciprocity Theory of, by 
H. S. Green and K. C. Cheng, 105-138. 

Electron-capture Process, Sargent Diagram 
for, by N. Feather, 242-256. 

Equation, Solution of a Functional, by A. H. 
Read, 336-345. 


Cambi (Enzo). The Simplest Form of 
Second-Order Linear Differential Equation, 
with Periodic Coefficient, having Finite 
Singularities, 27-51. 

Chemical Journal, The First, by J. Kendall, 
Part I, 346-358; Part II, 385-400. 

-Revolution, by J. Kendall, Part I, 
346-358; Part II, 385-400. 

-Society, The First, by J. Kendall, 
Part I, 346-358; Part II, 385-400. 

Cheng (K. C.). See Green (H. S.) and 
Cheng (K. C.). 

Circular Cone, Normal Penetration of a Thin 
Elastic-Plastic Plate by a Right, by J. W. 
Craggs, 359-370. 

Clebsch-Aronhold Symbols and the Theory of 
Symmetric Functions, by H. W. Turnbull 
and A. H. Wallace, 155-173. 

Compressible Flow past a Double Wedge, 
Application of Relaxation Methods to, by 
A. R. Mitchell and D. E. Rutherford, 
139-154. 

Continuant Determinants arising in Physics 
and Chemistry, by D. E. Rutherford, 
232-241. 

Co-variance, On the Estimation of Variance 
and, by E. H. Lloyd, 280-289. 


Factor Analysis, A Further Note on a Problem 
in, by D. N. Lawley, 93-94. 

Factorization of Polynomials by Iterative 
Methods, by A. C. Aitken, 174-191. 

Factorizing Polynomials by Iterated Division, 
by A. C. Aitken, 326-335. 

Feather (N.). The Sargent Diagram for the 
Electron-capture Process, and the Disintegration 
Energies of Heavy β-emitters, 
242-256. 

First Chemical Society, the First Chemical 
Journal, and the Chemical Revolution, by 
J. Kendall, Part I, 346-358; Part II, 385-400. 

Fréchet (Maurice). Les Transformations 
asymptotiquement presque périodiques discontinues 
et le lemme ergodique. (Première 
Note), 61-68. 

Functional Equation, The Solution of a, by 
A. H. Read, 336-345. 

Gabor Diffraction Microscopy: Experiments in 
Diffraction Microscopy, by G. L. Rogers, 
193-221. 

Green (H. S.) and Cheng (K. C.). The 
Reciprocity Theory of Electro-dynamics, 
105-138. 




Holograms, Artificial Holograms and Astigmatism, 
by G. L. Rogers, 313-325. 

-Experiments in Diffraction Microscopy, 
by G. L. Rogers, 193-221. 

Houstoun (R. A.). A Measurement of the 
Velocity of Light, 95-104. 

Hypothesis, Adventures of an, by James 
Kendall, 1-17. 


Iterated Division, Factorization of Polynomials 
by, by A. C. Aitken, 326-335. 

Iteration, Solution of Linear Equations by, 
by A. C. Aitken, 52-60. 

Iterative Methods, Factorisation of Polynomials 
by, by A. C. Aitken, 174-191. 


Kendall (James). The Adventures of an 
Hypothesis, 1-17. 

-The First Chemical Society, the First 
Chemical Journal, and the Chemical Revolution, 
Part I, 346-358; Part II, 385-400. 

Klein (G.). A Generalisation of the Classical 
Random-walk Problem, and a Simple 
Model of Brownian Motion based thereon, 
268-279. 

Lawley (D. N.). A Further Note on a 
Problem in Factor Analysis, 93-94. 

Light, A Measurement of the Velocity of, by 
R. A. Houstoun, 95-104. 

Linear Equations, Iterative Solution of, by 
A. C. Aitken, 52-60. 

Line Intensities; A Molecular Sum Rule, by 
D. ter Haar, 381-384. 

Lloyd (E. H.). On the Estimation of Variance 
and Co-variance, 280-289. 


McCall (Francis). See Mitchell (A. R.) and 
McCall (Francis). 

Macintyre (A. J.) and Macintyre (Sheila S.). 
Theorems on the Convergence and Asymptotic 
Validity of Abel's Series, 222-231. 

Macintyre (Sheila S.). See Macintyre (A. J.) 
and Macintyre (Sheila S.). 

Microscopy, Experiments in Diffraction, by 
G. L. Rogers, 193-221. 

Mitchell (A. R.) and McCall (Francis). The 
Rotational Field behind a Bow Shock Wave 
in Axially Symmetric Flow, using Relaxation 
Methods, 371-380. 

Mitchell (A. R.) and Rutherford (D. E.). 
Application of Relaxation Methods to 
Compressible Flow past a Double Wedge, 
139-154. 

Normal Penetration of a Thin Elastic-Plastic 
Plate by a Right Circular Cone, by J. W. 
Craggs, 359-370. 

Parallel Planes in a Riemannian V_n, by H. S. 
Ruse, 78-92. 

Penetration of a Thin Elastic-Plastic Plate by 
a Right Circular Cone, by J. W. Craggs, 
359-370. 


Polynomials, Factorisation of, by Iterative 
Methods, by A. C. Aitken, 174-191. 

-Factorising by Iterated Division, by 
A. C. Aitken, 326-335. 

Prime Number Theorem, Elementary Proof of 
the, by E. M. Wright, 257-267. 


Random-walk Problem, A Generalisation of 
the Classical, by G. Klein, 268-279. 

Read (A. H.). The Solution of a Functional 
Equation, 336-345. 

Reciprocity Theory of Electrodynamics, by 
H. S. Green and K. C. Cheng, 105-138. 

Relaxation Methods, Application of, to Compressible 
Flow past a Double Wedge, by 
A. R. Mitchell and D. E. Rutherford, 
139-154. 

-Rotational Field behind a Bow Shock 
Wave in Axially Symmetric Flow, by 
A. R. Mitchell and Francis McCall, 371-380. 

Riemannian V_n, Parallel Planes in a, by H. S. 
Ruse, 78-92. 

Rogers (G. L.). Artificial Holograms and 
Astigmatism, 313-325. 

-Experiments in Diffraction Microscopy, 
193-221. 

Ruse (H. S.). Parallel Planes in a Riemannian 
V_n, 78-92. 

Rutherford (D. E.). Some Continuant Determinants 
arising in Physics and Chemistry: 
II, 232-241. 

-See Mitchell (A. R.) and Rutherford 
(D. E.). 


Spectroscopy: A Molecular Sum Rule, by 
D. ter Haar, 381-384. 

Statistical Theory of Stiff Chains, by H. E. 
Daniels, 290-311. 

Statistics, Unbiased, with Minimum Variance, 
by A. Bhattacharyya, 69-77. 

Stiff Chains, Statistical Theory of, by H. E. 
Daniels, 290-311. 

Sum Rule, A Molecular, by D. ter Haar, 
381-384. 

Symmetric Functions, treated by Symbolic 
Methods, by H. W. Turnbull and A. H. 
Wallace, 155-173. 


ter Haar (D.). A Molecular Sum Rule, 
381-384. 

Theorem, Prime Number, Elementary Proof of, 
by E. M. Wright, 257-267. 

Transformations (Les) asymptotiquement presque 
périodiques discontinues et le lemme 
ergodique. (Première Note), by Maurice 
Fréchet, 61-68. 

Turnbull (H. W.) and Wallace (A. H.). 
Clebsch-Aronhold Symbols and the Theory 
of Symmetric Functions, by H. W. Turnbull 
and A. H. Wallace, 155-173. 


Unbiased Statistics with Minimum Variance, 
by A. Bhattacharyya, 69-77. 




Variance and Co-variance, On the Estimation 
of, by E. H. Lloyd, 280-289. 

Velocity of Light, A Measurement of the, by 
R. A. Houstoun, 95-104. 

Wallace (A. H.). See Turnbull (H. W.) and 
Wallace (A. H.). 

Wright (E. M.). The Stability of Solutions of 
Non-linear Difference-differential Equations, 
18-26. 

-The Elementary Proof of the Prime 
Number Theorem, 257-267. 

Zone Plate: Experiments in Diffraction 
Microscopy, by G. L. Rogers, 193-221. 


PRINTED IN GREAT BRITAIN BY OLIVER AND BOYD LTD., EDINBURGH 






